Power management and distributed audio processing techniques for playback devices

ABSTRACT

Aspects of the present disclosure relate to power management techniques for reducing the power consumption of playback devices. Additionally, aspects of the present disclosure relate to distributed processing techniques for processing audio across two or more processors.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority under 35 U.S.C. § 371 as a national stage application of PCT Application No. PCT/US2020/045465, filed Aug. 7, 2020, which claims priority to U.S. Provisional Patent Application No. 62/884,966, filed on Aug. 9, 2019, titled "Distributed Processing Architecture for Playback Devices," each of which is hereby incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present technology relates to consumer goods and, more particularly, to methods, systems, products, features, services, and other elements directed to media playback systems or some aspect thereof.

BACKGROUND

Options for accessing and listening to digital audio in an out-loud setting were limited until 2003, when SONOS, Inc. filed for one of its first patent applications, entitled "Method for Synchronizing Audio Playback between Multiple Networked Devices," and began offering a media playback system for sale in 2005. The SONOS Wireless HiFi System enables people to experience music from many sources via one or more networked playback devices. Through a software control application installed on a smartphone, tablet, or computer, one can play what he or she wants in any room that has a networked playback device. Additionally, using a controller, for example, different songs can be streamed to each room that has a playback device, rooms can be grouped together for synchronous playback, or the same song can be heard in all rooms synchronously.

Given the ever-growing interest in digital media, there continues to be a need to develop consumer-accessible technologies to further enhance the listening experience.

BRIEF DESCRIPTION OF THE DRAWINGS

Features, aspects, and advantages of the presently disclosed technology may be better understood with regard to the following description, appended claims, and accompanying drawings.

FIG. 1A is a partial cutaway view of an environment having a media playback system configured in accordance with aspects of the disclosed technology.

FIG. 1B is a schematic diagram of the media playback system of FIG. 1A and one or more networks.

FIG. 2A is a functional block diagram of an example playback device.

FIG. 2B is an isometric diagram of an example housing of the playback device of FIG. 2A.

FIG. 2C is a diagram of another example housing for the playback device of FIG. 2A.

FIG. 2D is a diagram of another example housing for the playback device of FIG. 2A.

FIGS. 3A-3E are diagrams showing example playback device configurations in accordance with aspects of the disclosure.

FIG. 4A is a functional block diagram of an example controller device in accordance with aspects of the disclosure.

FIGS. 4B and 4C are controller interfaces in accordance with aspects of the disclosure.

FIG. 5 is a functional block diagram of certain components of an example device employing a distributed processing architecture in accordance with aspects of the disclosure.

FIG. 6 is a functional block diagram of a module in accordance with aspects of the disclosure.

FIG. 7 illustrates power states through which the playback device transitions to facilitate lowering power consumption in accordance with aspects of the disclosure.

FIG. 8 illustrates operations performed by the playback device during initial power-up or activation of the playback device in accordance with aspects of the disclosure.

FIG. 9 illustrates operations performed by the playback device after a first processor(s) and a second processor(s) of the playback device have been initialized in accordance with aspects of the disclosure.

FIG. 10 illustrates operations performed by the playback device when the first processor(s) and the second processor(s) are in an awake state in accordance with aspects of the disclosure.

FIG. 11 illustrates a distributed audio processing environment in which processing operations associated with the playback of audio content are distributed among multiple processors in accordance with aspects of the disclosure.

FIG. 12 illustrates a logical representation of processing operations performed by the first processor(s) and the second processor(s) of the playback device in accordance with aspects of the disclosure.

FIG. 13 illustrates operations performed by the playback device to facilitate distributed audio processing in accordance with aspects of the disclosure.

FIG. 14 illustrates further operations performed by the playback device to facilitate distributed audio processing in accordance with aspects of the disclosure.

FIG. 15 illustrates an example state diagram of various operating modes for a playback device in accordance with aspects of the disclosure.

The drawings are for purposes of illustrating example embodiments, but it should be understood that the inventions are not limited to the arrangements and instrumentality shown in the drawings. In the drawings, identical reference numbers identify at least generally similar elements. To facilitate the discussion of any particular element, the most significant digit or digits of any reference number refers to the Figure in which that element is first introduced. For example, element 103 a is first introduced and discussed with reference to FIG. 1A.

DETAILED DESCRIPTION

I. Overview

Consumers typically expect battery powered devices to have a long runtime before the battery needs to be recharged. Consumers generally have different runtime expectations for different types of battery powered devices. For example, consumers may expect general-purpose computing devices (e.g., a laptop, a smartphone, or a tablet) to have a shorter runtime of 8-10 hours and expect more specialized devices (e.g., a wireless game controller, a pair of wireless headphones, a wireless keyboard, or a digital camera) to have a longer runtime of at least 20 hours.

The differences in runtimes between portable general-purpose computing devices and more specialized devices result, at least in part, from different processing architectures. For example, general-purpose computing devices typically execute a general-purpose operating system (GPOS), such as WINDOWS, MACOS, ANDROID, IOS, or LINUX, that generally necessitates a power-hungry processor. For example, a GPOS may necessitate use of a more complex processor that supports memory virtualization. By employing a GPOS, these general-purpose computing devices can be programmed to perform a variety of sophisticated operations. In contrast, specialized devices typically do not support such sophisticated operations enabled by a GPOS. For example, a pair of conventional wireless headphones that only wirelessly communicate via BLUETOOTH with a smartphone that directly provides the audio content for playback does not need to: (1) wirelessly communicate with remote servers over a more sophisticated wireless network such as a WIFI network; and/or (2) perform one or more authentication operations to successfully obtain access to media content on the remote servers. As a result, these specialized devices can typically employ a special-purpose operating system (SPOS) that is capable of being executed by simpler processors that consume less power.

As consumer demand for additional functionality in specialized devices grows, the conventional processing architecture for such specialized devices is insufficient. In particular, the additional functionality consumers desire may require a shift from only using an SPOS to using a GPOS. One challenge in designing a specialized device with such enhanced functionality is that the shift from an SPOS to a GPOS may require the use of more power-hungry processors. As a result, the long battery life consumers expect in such specialized devices may no longer be easily achievable. For example, consumers may expect a wireless headphone that only plays back audio received over BLUETOOTH to have a runtime between 20 and 25 hours. While such a long runtime is easily achievable by employing an SPOS executed on a simple processor, the runtime may be substantially reduced (e.g., to only 5 to 8 hours) when the architecture is replaced with more complex processors executing a GPOS to enable more complex features (e.g., audio streaming over WIFI from a music streaming service provider). While such a short runtime may be perfectly acceptable to consumers for general-purpose computing devices, such a short runtime is intolerable to consumers in a specialized device.

Accordingly, aspects of the present disclosure relate to a distributed processing architecture for specialized devices (and/or systems employing one or more specialized devices) that integrates a more sophisticated processor capable of executing a GPOS without significantly increasing power consumption. In some embodiments, the distributed processing architecture includes a low-power processor that executes an SPOS and a high-power processor that executes a GPOS. In these embodiments, the low-power processor may perform the basic operations of the device such as obtaining information from one or more input devices (e.g., capacitive touch sensors, buttons, etc.), providing output signals to one or more components (e.g., amplifiers, etc.), and controlling the power state of other components (including the high-power processor). The high-power processor may be invoked by the low-power processor only for high complexity tasks that could not be easily performed (or not performed at all) on an SPOS executed by the low-power processor, such as authenticating with a remote server over WIFI to obtain access to media files. Accordingly, the low-power processor can put the high-power processor into a low-power state (including completely turning the high-power processor off) during periods where no higher complexity tasks are being performed. Thus, the high-power processor executing the GPOS may, in at least some respects, function as a co-processor to the low-power processor executing the SPOS. As a result, the architecture enables a specialized device to implement more complex features that require a GPOS without substantially reducing battery life relative to less capable specialized devices in the same market segment.
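
By way of illustration, the following sketch (in C) shows one way the dispatch policy described above might be realized on the low-power processor. It is a minimal sketch, not an actual product implementation: the task set and the hp_power_up/hp_power_down/hp_submit/spos_handle routines are hypothetical stand-ins for platform-specific power-control and inter-processor calls.

    /* build: gcc -std=c11 dispatch.c */
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { TASK_READ_SENSOR, TASK_PLAY_LOCAL_AUDIO,
                   TASK_WIFI_AUTH,   TASK_CLOUD_FETCH } task_t;

    /* Stubs standing in for platform power-control and inter-processor calls. */
    static void hp_power_up(void)    { puts("high-power processor: powered up"); }
    static void hp_power_down(void)  { puts("high-power processor: powered down"); }
    static void hp_submit(task_t t)  { printf("GPOS task %d submitted\n", (int)t); }
    static void spos_handle(task_t t){ printf("SPOS handled task %d\n", (int)t); }

    /* Only tasks that need a GPOS (e.g., authenticating with a remote server
     * over WIFI) are routed to the high-power processor. */
    static bool requires_gpos(task_t task)
    {
        return task == TASK_WIFI_AUTH || task == TASK_CLOUD_FETCH;
    }

    static void dispatch(task_t task)
    {
        if (requires_gpos(task)) {
            hp_power_up();      /* wake the co-processor only when needed */
            hp_submit(task);
        } else {
            spos_handle(task);  /* basic operations stay on the low-power side */
        }
    }

    int main(void)
    {
        dispatch(TASK_PLAY_LOCAL_AUDIO);  /* handled locally; no wake-up */
        dispatch(TASK_WIFI_AUTH);         /* triggers a wake-up */
        hp_power_down();                  /* idle again: return to the low-power state */
        return 0;
    }

The essential design choice is that power-state control lives on the low-power side, so the high-power processor can be held in its low-power state whenever no GPOS-only task is pending.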

The architecture described herein stands in contrast to conventional multi-processor architectures where a high-power processor executing a GPOS handles the low-level tasks of the device (including power state control of other devices) while specific tasks that can be more efficiently handled by specialized hardware are offloaded to a specialized co-processor. For example, a conventional device may employ a high-power processor executing a GPOS that offloads some math-intensive tasks to a digital signal processor (DSP) that is better suited to handle such tasks efficiently. With such an architecture, the high-power processor still needs to remain powered even when the device is only performing tasks suited for the DSP to, for example, perform background tasks (e.g., controlling user interface components, providing obtained information to the DSP, etc.). As a result, the high-power processor still consumes a considerable amount of power even when the device is only performing tasks particularly suited for the DSP.

The distributed architectures described herein may be advantageously employed in any of a variety of specialized devices. For example, the distributed architecture may be implemented in a playback device. The playback device may comprise one or more amplifiers configured to drive one or more speakers. The one or more speakers may be integrated with the playback device (e.g., to form an all-in-one smart speaker) or separate from the playback device (e.g., to form a smart amplifier). The playback device may further comprise one or more network interface components to facilitate communication over one or more wireless networks. For example, the one or more network interface components may be capable of wirelessly communicating with a first computing device over a first wireless network (e.g., a cellular network and/or a wireless local area network (WLAN)) and wirelessly communicating (e.g., simultaneously wirelessly communicating) with a second computing device over another network, such as a BLUETOOTH network (e.g., a BLUETOOTH CLASSIC network, a BLUETOOTH LOW ENERGY (BLE) network, etc.). The playback device may further comprise a plurality of processing components configured to execute instructions that cause the playback device to perform various operations. The plurality of processing components may comprise low-power processor(s) and high-power processor(s) that are constructed differently from the low-power processor(s). Additionally, the low-power processor(s) may execute a different operating system than the high-power processor(s). For example, the high-power processor(s) may be configured to support virtual memory (e.g., an abstraction of the available storage resources) and execute an operating system that may at least partially employ virtualized memory, such as a GPOS. In contrast, the low-power processor(s) may not be configured to support virtual memory and execute an operating system that does not require virtual memory support, such as a Real-Time Operating System (RTOS) or other SPOS.

In some embodiments, a subset of the operations to be performed by the plurality of processing components may only be practically implemented on processor(s) executing a GPOS, such as managing authentication with a remote server to obtain access to audio content, while the remainder of the operations may be practically implemented on processor(s) executing either a GPOS or an SPOS, such as reading sensor data or playing back audio stored in a local memory. Given that a processor that is suitable for a GPOS (e.g., supports virtual memory) consumes more power than a processor that is not suitable for a GPOS (e.g., does not support virtual memory), the high-power processor(s) executing a GPOS may only be invoked (e.g., by the low-power processor(s)) for those operations not suitable for execution on the low-power processor(s) and otherwise kept in a low-power state (e.g., including being completely powered off). Thus, the high-power processor(s) may, in at least some respects (and/or situations), function as co-processor(s) to the low-power processor(s). By architecting a playback device such that the high-power processor(s) are only required for certain complex tasks, the high-power processor(s) may be completely turned off (or otherwise in a low-power state) without interfering with other aspects of the operation of the playback device. For example, the high-power processor(s) may be entirely turned off without impacting less complex operations such as playing back locally stored audio (e.g., stored in a buffer).

It should be appreciated that the particular way in which operations are distributed between the low and high-power processors, in addition to the particular triggers that cause the processor(s) to change between various power states, may vary based on the particular implementation. In some embodiments, the playback device may be configured to stream audio content over the Internet. In these embodiments, the operations involving communicating with a remote server to obtain the audio content may not be practical to implement on the low-power processor(s) executing an SPOS. In contrast, the operations involving playback of the downloaded audio content may be suitable for execution on the low-power processor(s). Accordingly, in some embodiments, the low-power processor(s) may keep the high-power processor(s) in a low-power state until audio content from a remote server is needed. Once audio content from a remote server is needed, the low-power processor(s) may wake up the high-power processor(s) such that the high-power processor(s) can obtain the audio content from the remote server (e.g., by communicating over the Internet). Once the high-power processor(s) has obtained the audio content, the high-power processor(s) may make the audio content available to the low-power processor(s). For example, the high-power processor(s) may transmit the audio content to the low-power processor(s) via a communication bus, such as a serial peripheral interface (SPI) bus. Alternatively (or additionally), the high-power processor(s) may store the audio content in a shared memory that is accessible by the low-power processor(s). Once the audio content is accessible to the low-power processor(s), the low-power processor(s) may cause the playback device to play back the audio content.
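
The following sketch illustrates the shared-memory variant of the handoff described above, assuming a single-producer/single-consumer ring buffer placed in a memory region visible to both processors; the buffer size, field names, and functions are illustrative assumptions rather than an actual interface.

    #include <stdint.h>

    #define AUDIO_BUF_BYTES (64u * 1024u)   /* assumed size; a power of two */

    typedef struct {
        volatile uint32_t write_idx;        /* advanced only by the high-power processor */
        volatile uint32_t read_idx;         /* advanced only by the low-power processor  */
        uint8_t data[AUDIO_BUF_BYTES];
    } shared_audio_buf_t;

    /* High-power side: store downloaded audio where the low-power side can read it. */
    uint32_t hp_write(shared_audio_buf_t *b, const uint8_t *src, uint32_t len)
    {
        uint32_t free_bytes = AUDIO_BUF_BYTES - (b->write_idx - b->read_idx);
        if (len > free_bytes)
            len = free_bytes;               /* never overwrite unplayed audio */
        for (uint32_t i = 0; i < len; i++)
            b->data[(b->write_idx + i) % AUDIO_BUF_BYTES] = src[i];
        b->write_idx += len;
        return len;
    }

    /* Low-power side: drain audio toward the amplifiers during playback. */
    uint32_t lp_read(shared_audio_buf_t *b, uint8_t *dst, uint32_t len)
    {
        uint32_t avail = b->write_idx - b->read_idx;
        if (len > avail)
            len = avail;
        for (uint32_t i = 0; i < len; i++)
            dst[i] = b->data[(b->read_idx + i) % AUDIO_BUF_BYTES];
        b->read_idx += len;
        return len;
    }

Because each index is written by exactly one side, the high-power processor(s) can fill the buffer, hand control to the low-power processor(s), and then be powered down without further coordination; an SPI transport could replace hp_write with a bus transfer.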

Additionally, in some embodiments, the low-power processor(s) may cause the high-power processor(s) to return to a low-power state once the low-power processor(s) has access to the audio content, given that the high-power processor(s) may not be required to play back the retrieved audio content. In these embodiments, the low-power processor(s) may play back the retrieved audio content and, while playing back the retrieved audio content, monitor how much of the retrieved audio content has yet to be played back (e.g., how much time is left before playback of the retrieved audio content is exhausted). When the retrieved audio content is nearly exhausted, the low-power processor(s) may wake up the high-power processor(s) from the low-power state such that the high-power processor(s) may obtain additional audio content from a remote server to continue uninterrupted audio playback.
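
A minimal sketch of that refill policy follows, reusing the shared_audio_buf_t layout from the preceding sketch; the five-second watermark and the audio format are assumed values, and the hp_* routines again stand in for platform calls.

    #include <stdbool.h>
    #include <stdint.h>

    /* shared_audio_buf_t as defined in the preceding sketch */
    typedef struct {
        volatile uint32_t write_idx;
        volatile uint32_t read_idx;
        uint8_t data[64u * 1024u];
    } shared_audio_buf_t;

    extern bool hp_is_awake(void);
    extern void hp_power_up(void);
    extern void hp_request_more_audio(void);

    #define BYTES_PER_SECOND (44100u * 2u * 2u)      /* assumed: 44.1 kHz, 16-bit stereo */
    #define LOW_WATERMARK    (5u * BYTES_PER_SECOND) /* assumed five-second threshold    */

    /* Run periodically by the low-power processor(s) during playback. */
    void lp_playback_tick(const shared_audio_buf_t *b)
    {
        uint32_t buffered = b->write_idx - b->read_idx;  /* audio not yet played back */

        if (buffered < LOW_WATERMARK && !hp_is_awake()) {
            hp_power_up();            /* wake the high-power processor(s) ...        */
            hp_request_more_audio();  /* ... to fetch more content before exhaustion */
        }
    }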

In some embodiments, the playback device may be configured to provide voice assistant service (VAS) functionality over WIFI. In these embodiments, the operations involving communicating with a remote server to provide a voice input and obtain a response may not be practical to implement on the low-power processor(s) executing an SPOS. In contrast, the operations involving pre-processing of the voice input to remove noise (e.g., remove echoes, remove background chatter, etc.) and/or detecting a wake-word (e.g., an activation word) may be suitable for execution on the low-power processor(s) executing the SPOS. Accordingly, in some embodiments, the low-power processor(s) may perform one or more operations to detect the utterance of a wake-word, remove noise from the voice input, and make the de-noised voice input accessible to the high-power processor(s). In turn, the high-power processor(s) may transmit the de-noised voice input to a remote server and obtain a response from the remote server.
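
The sketch below illustrates that division of labor on the low-power side; the remove_noise and wake_word_detected functions are hypothetical stand-ins for the pre-processing and wake-word components, not a real API.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    extern void remove_noise(const int16_t *in, int16_t *out, size_t n);
    extern bool wake_word_detected(const int16_t *frame, size_t n);
    extern void hp_power_up(void);
    extern void hp_send_voice_input(const int16_t *frame, size_t n);

    /* Called by the low-power processor(s) for each captured microphone frame. */
    void lp_process_mic_frame(const int16_t *raw, size_t n)
    {
        static int16_t clean[512];      /* assumed frame size */
        if (n > 512)
            n = 512;

        remove_noise(raw, clean, n);    /* echo/background-chatter removal */

        if (wake_word_detected(clean, n)) {
            hp_power_up();                  /* the GPOS is needed only for the VAS round trip */
            hp_send_voice_input(clean, n);  /* de-noised voice goes to the remote server      */
        }
    }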

In an embodiment, the playback device includes amplifiers for driving speakers and a network interface that facilitates communications with a first remote device via a first communication link, and with a second remote device via a second communication link. The playback device includes a first processor(s) and a second processor(s) with different constructions. The first processor(s) implements multiple power states such as an awake state and a sleep state. In this embodiment, audio content is received by the second processor(s) from the second remote device via the second communication link for playback via the speakers. An indication is received from the network interface as to whether the first communication link can be established between the playback device and the first remote device. When the first communication link can be established, the first processor(s) is transitioned to the awake state to facilitate receiving audio content via the first communication link for playback via the one or more speakers. When the first communication link cannot be established, the first processor(s) is transitioned to the sleep state to lower playback device power consumption.
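
One way to express this policy is as a small state machine driven by the link indication, sketched below under the assumption that the network interface delivers the indication via a callback; the names are illustrative.

    #include <stdbool.h>

    typedef enum { P1_SLEEP, P1_AWAKE } p1_state_t;

    static p1_state_t p1_state = P1_SLEEP;

    extern void p1_wake(void);   /* stand-ins for the actual power-state controls */
    extern void p1_sleep(void);

    /* Called with an indication of whether the first communication link
     * (e.g., a WLAN link) can be established with the first remote device. */
    void on_link_indication(bool first_link_available)
    {
        if (first_link_available && p1_state == P1_SLEEP) {
            p1_wake();            /* enable receiving audio over the first link */
            p1_state = P1_AWAKE;
        } else if (!first_link_available && p1_state == P1_AWAKE) {
            p1_sleep();           /* lower power consumption; playback over the second
                                     link continues on the second processor(s) */
            p1_state = P1_SLEEP;
        }
    }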

In another embodiment, a playback device includes amplifiers for driving speakers and a network interface that facilitates communications with a first remote device via a first communication link. The playback device includes a first processor(s) and a second processor(s) with different constructions. In this embodiment, audio information that includes audio content is received by the first processor(s) from the first remote device. The first processor(s) generates metadata associated with the audio content. The first processor(s) communicates the audio content and the metadata to the second processor(s). The second processor(s) processes the audio content according to the metadata and communicates the processed audio content to the amplifiers for playback.
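
The following sketch shows one plausible shape for that handoff: a small metadata record generated by the first processor(s) accompanies each chunk of audio content. The fields and functions are illustrative guesses, not a schema from the disclosure.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint32_t sample_rate_hz;    /* e.g., 44100 */
        uint16_t channels;          /* e.g., 2 for stereo */
        uint16_t bits_per_sample;   /* e.g., 16 */
        uint32_t codec_id;          /* tells the second processor(s) how to decode */
    } audio_metadata_t;

    extern void p2_process_and_play(const uint8_t *audio, size_t len,
                                    const audio_metadata_t *meta);

    /* First processor(s): derive metadata from the received audio information,
     * then communicate both to the second processor(s). */
    void p1_forward_audio(const uint8_t *audio, size_t len)
    {
        audio_metadata_t meta = {
            .sample_rate_hz  = 44100,   /* assumed values parsed from the stream */
            .channels        = 2,
            .bits_per_sample = 16,
            .codec_id        = 0,
        };
        p2_process_and_play(audio, len, &meta);  /* processed per the metadata */
    }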

It should be appreciated that the distributed processing architectures and/or the power management techniques described herein may be advantageously employed in specialized devices separate and apart from playback devices. For example, the distributed processing architectures described herein may be employed in any Internet of Things (IoT) device. An IoT device may be, for example, a device designed to perform one or more specific tasks (e.g., making coffee, reheating food, locking a door, providing power to another device, playing music) based on information received via a network (e.g., a wide area network (WAN) such as the Internet). Examples of such IoT devices include: a smart thermostat, a smart doorbell, a smart lock (e.g., a smart door lock), a smart outlet, a smart light, a smart camera, a smart kitchen appliance (e.g., a smart oven, a smart coffee maker, a smart microwave), and a smart speaker (including the network accessible and/or voice-enabled playback devices described above).

Further, the distributed processing architectures and/or the power management techniques described herein may be readily applied to a network of two or more devices (e.g., playback devices). For example, a first playback device that has consistent access to one or more external power sources (e.g., a stationary playback device plugged into a wall outlet, such as a soundbar) may house a high-power processor executing a GPOS (e.g., an application processor) and a second playback device that is power-constrained (e.g., does not have consistent access to an external power source, such as a battery-powered playback device) may house the low-power processor executing an SPOS. The first and second playback devices may be connected to a common communication network, such as a BLUETOOTH network (e.g., a BLUETOOTH LOW ENERGY (BLE) network and/or a BLUETOOTH CLASSIC network) or other personal area network (PAN). In this example, the second playback device may off-load tasks that are unsuitable for execution by the low-power processor executing the SPOS to the high-power processor of the first playback device (e.g., via the common communication network). Thus, the second playback device may (e.g., when connected to a common network with the first playback device) support complex functions while still having a low power consumption (e.g., and a long run-time on battery power).
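
A sketch of such cross-device off-loading follows, assuming a simple request message carried over the common network (e.g., a BLE connection); the message layout and the pan_send function are invented for illustration.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint8_t  task_id;        /* e.g., 0x01 = authenticate with a streaming service */
        uint16_t payload_len;
        uint8_t  payload[256];   /* task arguments */
    } offload_request_t;

    extern int pan_send(const void *msg, size_t len);  /* stand-in for the PAN stack */

    /* Battery-powered device: push a GPOS-only task to the mains-powered peer. */
    int offload_to_peer(uint8_t task_id, const uint8_t *args, uint16_t len)
    {
        offload_request_t req = { .task_id = task_id, .payload_len = len };
        if (len > sizeof req.payload)
            return -1;                      /* too large for this simple transport */
        for (uint16_t i = 0; i < len; i++)
            req.payload[i] = args[i];
        return pan_send(&req, sizeof req);  /* the peer's high-power processor replies later */
    }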

While some embodiments described herein may refer to functions performed by given actors, such as "users" and/or other entities, it should be understood that this description is for purposes of explanation only. The claims should not be interpreted to require action by any such example actor unless explicitly required by the language of the claims themselves.

II. Example Operating Environment

FIGS. 1A and 1B illustrate an example configuration of a media playback system 100 (or "MPS 100") in which one or more embodiments disclosed herein may be implemented. Referring first to FIG. 1A, the MPS 100 as shown is associated with an example home environment having a plurality of rooms and spaces, which may be collectively referred to as a "home environment," "smart home," or "environment 101." The environment 101 comprises a household having several rooms, spaces, and/or playback zones, including a master bathroom 101 a, a master bedroom 101 b (referred to herein as "Nick's Room"), a second bedroom 101 c, a family room or den 101 d, an office 101 e, a living room 101 f, a dining room 101 g, a kitchen 101 h, and an outdoor patio 101 i. While certain embodiments and examples are described below in the context of a home environment, the technologies described herein may be implemented in other types of environments. In some embodiments, for example, the MPS 100 can be implemented in one or more commercial settings (e.g., a restaurant, mall, airport, hotel, a retail or other store), one or more vehicles (e.g., a sports utility vehicle, bus, car, a ship, a boat, an airplane), multiple environments (e.g., a combination of home and vehicle environments), and/or another suitable environment where multi-zone audio may be desirable.

Within these rooms and spaces, the MPS 100 includes one or more computing devices. Referring to FIGS. 1A and 1B together, such computing devices can include playback devices 102 (identified individually as playback devices 102 a-102 o), network microphone devices 103 (identified individually as "NMDs" 103 a-103 i), and controller devices 104 a and 104 b (collectively "controller devices 104"). Referring to FIG. 1B, the home environment may include additional and/or other computing devices, including local network devices, such as one or more smart illumination devices 108 (FIG. 1B), a smart thermostat 110, and a local computing device 105 (FIG. 1A). In embodiments described below, one or more of the various playback devices 102 may be configured as portable playback devices, while others may be configured as stationary playback devices. For example, the headphones 102 o (FIG. 1B) are a portable playback device, while the playback device 102 d on the bookcase may be a stationary device. As another example, the playback device 102 c on the Patio may be a battery-powered device, which may allow it to be transported to various areas within the environment 101, and outside of the environment 101, when it is not plugged in to a wall outlet or the like.

With reference still to FIG. 1B, the various playback, network microphone, and controller devices 102-104 and/or other network devices of the MPS 100 may be coupled to one another via point-to-point connections and/or over other connections, which may be wired and/or wireless, via a local network 111 that may include a network router 109. For example, the playback device 102 j in the Den 101 d (FIG. 1A), which may be designated as the "Left" device, may have a point-to-point connection with the playback device 102 a, which is also in the Den 101 d and may be designated as the "Right" device. In a related embodiment, the Left playback device 102 j may communicate with other network devices, such as the playback device 102 b, which may be designated as the "Front" device, via a point-to-point connection and/or other connections via the local network 111. The local network 111 may be, for example, a network that interconnects one or more devices within a limited area (e.g., a residence, an office building, a car, an individual's workspace, etc.). The local network 111 may include, for example, one or more local area networks (LANs) such as wireless local area networks (WLANs) (e.g., WIFI networks, Z-WAVE networks, etc.) and/or one or more personal area networks (PANs) such as BLUETOOTH networks, wireless USB networks, ZIGBEE networks, and IRDA networks.

As further shown in FIG. 1B, the MPS 100 may be coupled to one or more remote computing devices 106 via a wide area network ("WAN") 107. In some embodiments, each remote computing device 106 may take the form of one or more cloud servers. The remote computing devices 106 may be configured to interact with computing devices in the environment 101 in various ways. For example, the remote computing devices 106 may be configured to facilitate streaming and/or controlling playback of media content, such as audio, in the home environment 101.

In some implementations, the various playback devices, NMDs, and/or controller devices 102-104 may be communicatively coupled to at least one remote computing device associated with a voice assistant service ("VAS") and at least one remote computing device associated with a media content service ("MCS"). For instance, in the illustrated example of FIG. 1B, remote computing devices 106 a are associated with a VAS 190 and remote computing devices 106 b are associated with an MCS 192. Although only a single VAS 190 and a single MCS 192 are shown in the example of FIG. 1B for purposes of clarity, the MPS 100 may be coupled to multiple, different VASes and/or MCSes. In some implementations, VASes may be operated by one or more of AMAZON, GOOGLE, APPLE, MICROSOFT, NUANCE, SONOS or other voice assistant providers. In some implementations, MCSes may be operated by one or more of SPOTIFY, PANDORA, AMAZON MUSIC, or other media content services.

As further shown in FIG. 1B, the remote computing devices 106 further include remote computing device 106 c configured to perform certain operations, such as remotely facilitating media playback functions, managing device and system status information, directing communications between the devices of the MPS 100 and one or multiple VASes and/or MCSes, among other operations. In one example, the remote computing devices 106 c provide cloud servers for one or more SONOS Wireless HiFi Systems.

In various implementations, one or more of the playback devices 102 may take the form of or include an on-board (e.g., integrated) network microphone device. For example, the playback devices 102 a-e include or are otherwise equipped with corresponding NMDs 103 a-e, respectively. A playback device that includes or is equipped with an NMD may be referred to herein interchangeably as a playback device or an NMD unless indicated otherwise in the description. In some cases, one or more of the NMDs 103 may be a stand-alone device. For example, the NMDs 103 f and 103 g may be stand-alone devices. A stand-alone NMD may omit components and/or functionality that is typically included in a playback device, such as a speaker or related electronics. For instance, in such cases, a stand-alone NMD may not produce audio output or may produce limited audio output (e.g., relatively low-quality audio output).

The various playback and network microphone devices 102 and 103 of the MPS 100 may each be associated with a unique name, which may be assigned to the respective devices by a user, such as during setup of one or more of these devices. For instance, as shown in the illustrated example of FIG. 1B, a user may assign the name "Bookcase" to playback device 102 d because it is physically situated on a bookcase. Similarly, the NMD 103 f may be assigned the name "Island" because it is physically situated on an island countertop in the Kitchen 101 h (FIG. 1A). Some playback devices may be assigned names according to a zone or room, such as the playback devices 102 e, 102 l, 102 m, and 102 n, which are named "Bedroom," "Dining Room," "Living Room," and "Office," respectively. Further, certain playback devices may have functionally descriptive names. For example, the playback devices 102 a and 102 b are assigned the names "Right" and "Front," respectively, because these two devices are configured to provide specific audio channels during media playback in the zone of the Den 101 d (FIG. 1A). The playback device 102 c in the Patio may be named "Portable" because it is battery-powered and/or readily transportable to different areas of the environment 101. Other naming conventions are possible.

As discussed above, an NMD may detect and process sound from its environment, such as sound that includes background noise mixed with speech spoken by a person in the NMD's vicinity. For example, as sounds are detected by the NMD in the environment, the NMD may process the detected sound to determine if the sound includes speech that contains voice input intended for the NMD and ultimately a particular VAS. For example, the NMD may identify whether speech includes a wake word associated with a particular VAS.

In the illustrated example of FIG. 1B, the NMDs 103 are configured to interact with the VAS 190 over the local network 111 and/or the router 109. Interactions with the VAS 190 may be initiated, for example, when an NMD identifies in the detected sound a potential wake word. The identification causes a wake-word event, which in turn causes the NMD to begin transmitting detected-sound data to the VAS 190. In some implementations, the various local network devices 102-105 (FIG. 1A) and/or remote computing devices 106 c of the MPS 100 may exchange various feedback, information, instructions, and/or related data with the remote computing devices associated with the selected VAS. Such exchanges may be related to or independent of transmitted messages containing voice inputs. In some embodiments, the remote computing device(s) and the media playback system 100 may exchange data via communication paths as described herein and/or using a metadata exchange channel as described in U.S. Patent Publication No. 2017-0242653, published Aug. 24, 2017, and titled "Voice Control of a Media Playback System," which is herein incorporated by reference in its entirety.

Upon receiving the stream of sound data, the VAS 190 determines if there is voice input in the streamed data from the NMD, and if so the VAS 190 will also determine an underlying intent in the voice input. The VAS 190 may next transmit a response back to the MPS 100, which can include transmitting the response directly to the NMD that caused the wake-word event. The response is typically based on the intent that the VAS 190 determined was present in the voice input. As an example, in response to the VAS 190 receiving a voice input with an utterance to "Play Hey Jude by The Beatles," the VAS 190 may determine that the underlying intent of the voice input is to initiate playback and further determine that the intent of the voice input is to play the particular song "Hey Jude." After these determinations, the VAS 190 may transmit a command to a particular MCS 192 to retrieve content (i.e., the song "Hey Jude"), and that MCS 192, in turn, provides (e.g., streams) this content directly to the MPS 100 or indirectly via the VAS 190. In some implementations, the VAS 190 may transmit to the MPS 100 a command that causes the MPS 100 itself to retrieve the content from the MCS 192.

In certain implementations, NMDs may facilitate arbitration amongst one another when voice input is identified in speech detected by two or more NMDs located within proximity of one another. For example, the NMD-equipped playback device 102 d in the environment 101 (FIG. 1A) is in relatively close proximity to the NMD-equipped Living Room playback device 102 m, and both devices 102 d and 102 m may at least sometimes detect the same sound. In such cases, this may require arbitration as to which device is ultimately responsible for providing detected-sound data to the remote VAS. Examples of arbitrating between NMDs may be found, for example, in previously referenced U.S. Patent Publication No. 2017-0242653.

In certain implementations, an NMD may be assigned to, or otherwise associated with, a designated or default playback device that may not include an NMD. For example, the Island NMD 103 f in the Kitchen 101 h (FIG. 1A) may be assigned to the Dining Room playback device 102 l, which is in relatively close proximity to the Island NMD 103 f. In practice, an NMD may direct an assigned playback device to play audio in response to a remote VAS receiving a voice input from the NMD to play the audio, which the NMD might have sent to the VAS in response to a user speaking a command to play a certain song, album, playlist, etc. Additional details regarding assigning NMDs and playback devices as designated or default devices may be found, for example, in previously referenced U.S. Patent Publication No. 2017-0242653.

Further aspects relating to the different components of the example MPS 100 and how the different components may interact to provide a user with a media experience may be found in the following sections. While discussions herein may generally refer to the example MPS 100, technologies described herein are not limited to applications within, among other things, the home environment described above. For instance, the technologies described herein may be useful in other home environment configurations comprising more or fewer of any of the playback, network microphone, and/or controller devices 102-104. For example, the technologies herein may be utilized within an environment having a single playback device 102 and/or a single NMD 103. In some examples of such cases, the local network 111 (FIG. 1B) may be eliminated and the single playback device 102 and/or the single NMD 103 may communicate directly with the remote computing devices 106 a-d. In some embodiments, a telecommunication network (e.g., an LTE network, a 5G network, etc.) may communicate with the various playback, network microphone, and/or controller devices 102-104 independent of the local network 111.

While specific implementations of MPS's have been described above with respect to FIGS. 1A and 1B, there are numerous configurations of MPS's, including, but not limited to, those that do not interact with remote services, systems that do not include controllers, and/or any other configuration as appropriate to the requirements of a given application.

a. Example Playback & Network Microphone Devices

FIG. 2A is a functional block diagram illustrating certain aspects of one of the playback devices 102 of the MPS 100 of FIGS. 1A and 1B. As shown, the playback device 102 includes various components, each of which is discussed in further detail below, and the various components of the playback device 102 may be operably coupled to one another via a system bus, communication network, or some other connection mechanism. In the illustrated example of FIG. 2A, the playback device 102 may be referred to as an "NMD-equipped" playback device because it includes components that support the functionality of an NMD, such as one of the NMDs 103 shown in FIG. 1A.

As shown, the playback device 102 includes at least one processor 212, which may be a clock-driven computing component configured to process input data according to instructions stored in memory 213. The memory 213 may be a tangible, non-transitory, computer-readable medium configured to store instructions that are executable by the processor 212. For example, the memory 213 may be data storage that can be loaded with software code 214 that is executable by the processor 212 to achieve certain functions.

In one example, these functions may involve the playback device 102 retrieving audio data from an audio source, which may be another playback device. In another example, the functions may involve the playback device 102 sending audio data, detected-sound data (e.g., corresponding to a voice input), and/or other information to another device on a network via at least one network interface 224. In yet another example, the functions may involve the playback device 102 causing one or more other playback devices to synchronously play back audio with the playback device 102. In yet a further example, the functions may involve the playback device 102 facilitating being paired or otherwise bonded with one or more other playback devices to create a multi-channel audio environment. Numerous other example functions are possible, some of which are discussed below.

As just mentioned, certain functions may involve the playback device 102 synchronizing playback of audio content with one or more other playback devices. During synchronous playback, a listener may not perceive time-delay differences between playback of the audio content by the synchronized playback devices. U.S. Pat. No. 8,234,395 filed on Apr. 4, 2004, and titled "System and method for synchronizing operations among a plurality of independently clocked digital data processing devices," which is hereby incorporated by reference in its entirety, provides in more detail some examples for audio playback synchronization among playback devices.

To facilitate audio playback, the playback device 102 includes audio processing components 216 that are generally configured to process audio prior to the playback device 102 rendering the audio. In this respect, the audio processing components 216 may include one or more digital-to-analog converters ("DAC"), one or more audio preprocessing components, one or more audio enhancement components, one or more digital signal processors ("DSPs"), and so on. In some implementations, one or more of the audio processing components 216 may be a subcomponent of the processor 212. In operation, the audio processing components 216 receive analog and/or digital audio and process and/or otherwise intentionally alter the audio to produce audio signals for playback.

The produced audio signals may then be provided to one or more audio amplifiers 217 for amplification and playback through one or more speakers 218 operably coupled to the amplifiers 217. The audio amplifiers 217 may include components configured to amplify audio signals to a level for driving one or more of the speakers 218.

Each of the speakers 218 may include an individual transducer (e.g., a "driver") or the speakers 218 may include a complete speaker system involving an enclosure with one or more drivers. A particular driver of a speaker 218 may include, for example, a subwoofer (e.g., for low frequencies), a mid-range driver (e.g., for middle frequencies), and/or a tweeter (e.g., for high frequencies). In some cases, a transducer may be driven by an individual corresponding audio amplifier of the audio amplifiers 217. In some implementations, a playback device may not include the speakers 218, but instead may include a speaker interface for connecting the playback device to external speakers. In certain embodiments, a playback device may include neither the speakers 218 nor the audio amplifiers 217, but instead may include an audio interface (not shown) for connecting the playback device to an external audio amplifier or audio-visual receiver.

In addition to producing audio signals for playback by the playback device 102, the audio processing components 216 may be configured to process audio to be sent to one or more other playback devices, via the network interface 224, for playback. In example scenarios, audio content to be processed and/or played back by the playback device 102 may be received from an external source, such as via an audio line-in interface (e.g., an auto-detecting 3.5 mm audio line-in connection) of the playback device 102 (not shown) or via the network interface 224, as described below.

As shown, the at least one network interface 224 may take the form of one or more wireless interfaces 225 and/or one or more wired interfaces 226. A wireless interface may provide network interface functions for the playback device 102 to wirelessly communicate with other devices (e.g., other playback device(s), NMD(s), and/or controller device(s)) in accordance with a communication protocol (e.g., any wireless standard including IEEE 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, 802.11ad, 802.11af, 802.11ah, 802.11ai, 802.11aj, 802.11aq, 802.11ax, 802.11ay, 802.15, BLUETOOTH, 4G mobile communication standard, 5G mobile communication standard, and so on). A wired interface may provide network interface functions for the playback device 102 to communicate over a wired connection with other devices in accordance with a communication protocol (e.g., IEEE 802.3). While the network interface 224 shown in FIG. 2A includes both wired and wireless interfaces, the playback device 102 may in some implementations include only wireless interface(s) or only wired interface(s).

In general, the network interface 224 facilitates data flow between the playback device 102 and one or more other devices on a data network. For instance, the playback device 102 may be configured to receive audio content over the data network from one or more other playback devices, network devices within a LAN, and/or audio content sources over a WAN, such as the Internet. In one example, the audio content and other signals transmitted and received by the playback device 102 may be transmitted in the form of digital packet data comprising an Internet Protocol (IP)-based source address and IP-based destination addresses. In such a case, the network interface 224 may be configured to parse the digital packet data such that the data destined for the playback device 102 is properly received and processed by the playback device 102.

As shown in FIG. 2A, the playback device 102 also includes voice processing components 220 that are operably coupled to one or more microphones 222. The microphones 222 are configured to detect sound (i.e., acoustic waves) in the environment of the playback device 102, which is then provided to the voice processing components 220. More specifically, each microphone 222 is configured to detect sound and convert the sound into a digital or analog signal representative of the detected sound, which can then cause the voice processing component 220 to perform various functions based on the detected sound, as described in greater detail below. In one implementation, the microphones 222 are arranged as an array of microphones (e.g., an array of six microphones). In some implementations, the playback device 102 includes more than six microphones (e.g., eight microphones or twelve microphones) or fewer than six microphones (e.g., four microphones, two microphones, or a single microphone).

In operation, the voice-processing components 220 are generally configured to detect and process sound received via the microphones 222, identify potential voice input in the detected sound, and extract detected-sound data to enable a VAS, such as the VAS 190 (FIG. 1B), to process voice input identified in the detected-sound data. The voice processing components 220 may include one or more analog-to-digital converters, an acoustic echo canceller ("AEC"), a spatial processor (e.g., one or more multi-channel Wiener filters, one or more other filters, and/or one or more beam former components), one or more buffers (e.g., one or more circular buffers), one or more wake-word engines, one or more voice extractors, and/or one or more speech processing components (e.g., components configured to recognize a voice of a particular user or a particular set of users associated with a household), among other example voice processing components. In example implementations, the voice processing components 220 may include or otherwise take the form of one or more DSPs or one or more modules of a DSP. In this respect, certain voice processing components 220 may be configured with particular parameters (e.g., gain and/or spectral parameters) that may be modified or otherwise tuned to achieve particular functions. In some implementations, one or more of the voice processing components 220 may be a subcomponent of the processor 212.
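
For illustration, the sketch below strings those components together in the order just described; each extern function is a stub standing in for the corresponding voice processing component, not an actual implementation.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    extern void aec_cancel_echo(int16_t *frame, size_t n);   /* acoustic echo canceller */
    extern void spatial_process(int16_t *frame, size_t n);   /* e.g., beam forming      */
    extern void circular_buffer_push(const int16_t *frame, size_t n);
    extern bool wake_word_engine_run(const int16_t *frame, size_t n);
    extern void voice_extractor_stream_to_vas(void);

    /* Process one frame of detected sound from the microphones 222. */
    void voice_pipeline_frame(int16_t *frame, size_t n)
    {
        aec_cancel_echo(frame, n);       /* remove the device's own playback audio */
        spatial_process(frame, n);       /* focus on the direction of the talker   */
        circular_buffer_push(frame, n);  /* retain recent sound for the extractor  */

        if (wake_word_engine_run(frame, n))
            voice_extractor_stream_to_vas();  /* begin sending detected-sound data */
    }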

In some implementations, the voice-processing components 220 may detect and store a user's voice profile, which may be associated with a user account of the MPS 100. For example, voice profiles may be stored as and/or compared to variables stored in a set of command information or data table. The voice profile may include aspects of the tone or frequency of a user's voice and/or other unique aspects of the user's voice, such as those described in previously-referenced U.S. Patent Publication No. 2017-0242653.

As further shown in FIG. 2A, the playback device 102 also includes power components 227. The power components 227 may include at least an external power source interface 228, which may be coupled to a power source (not shown) via a power cable or the like that physically connects the playback device 102 to an electrical outlet or some other external power source. Other power components may include, for example, transformers, converters, and like components configured to format electrical power.

In some implementations, the power components 227 of the playback device 102 may additionally include an internal power source 229 (e.g., one or more batteries) configured to power the playback device 102 without a physical connection to an external power source. When equipped with the internal power source 229, the playback device 102 may operate independent of an external power source. In some such implementations, the external power source interface 228 may be configured to facilitate charging the internal power source 229. As discussed before, a playback device comprising an internal power source may be referred to herein as a "portable playback device." Those portable playback devices that weigh no more than fifty ounces (e.g., between three ounces and fifty ounces, between five ounces and fifty ounces, between ten ounces and fifty ounces, between ten ounces and twenty-five ounces, etc.) may be referred to herein as an "ultra-portable playback device." Those playback devices that operate using an external power source instead of an internal power source may be referred to herein as a "stationary playback device," although such a device may in fact be moved around a home or other environment.

The playback device 102 may further include a user interface 240 that may facilitate user interactions independent of or in conjunction with user interactions facilitated by one or more of the controller devices 104. In various embodiments, the user interface 240 includes one or more physical buttons and/or supports graphical interfaces provided on touch sensitive screen(s) and/or surface(s), among other possibilities, for a user to directly provide input. The user interface 240 may further include one or more of lights (e.g., LEDs) and the speakers to provide visual and/or audio feedback to a user.

As an illustrative example, FIG. 2B shows an example housing 230 of the playback device 102 that includes a user interface in the form of a control area 232 at a top portion 234 of the housing 230. The control area 232 includes buttons 236 a-c for controlling audio playback, volume level, and other functions. The control area 232 also includes a button 236 d for toggling the microphones 222 to either an on state or an off state.

As further shown in FIG. 2B, the control area 232 is at least partially surrounded by apertures formed in the top portion 234 of the housing 230 through which the microphones 222 (not visible in FIG. 2B) receive the sound in the environment of the playback device 102. The microphones 222 may be arranged in various positions along and/or within the top portion 234 or other areas of the housing 230 so as to detect sound from one or more directions relative to the playback device 102.

As mentioned above, the playback device 102 may be constructed as a portable playback device, such as an ultra-portable playback device, that comprises an internal power source. FIG. 2C shows an example housing 240 for such a portable playback device. As shown, the housing 240 of the portable playback device includes a user interface in the form of a control area 242 at a top portion 244 of the housing 240. The control area 242 may include a capacitive touch sensor for controlling audio playback, volume level, and other functions. The housing 240 of the portable playback device may be configured to engage with a charging dock 246 that is connected to an external power source via cable 248. The charging dock 246 may be configured to provide power to the portable playback device to recharge an internal battery. In some embodiments, the charging dock 246 may comprise a set of one or more conductive contacts (not shown) positioned on the top of the charging dock 246 that engage with conductive contacts on the bottom of the housing 240 (not shown). In other embodiments, the charging dock 246 may provide power from the cable 248 to the portable playback device without the use of conductive contacts. For example, the charging dock 246 may wirelessly charge the portable playback device via one or more inductive coils integrated into each of the charging dock 246 and the portable playback device.

In some embodiments, the playback device 102 may take the form of a wired and/or wireless headphone (e.g., an over-ear headphone, an on-ear headphone, or an in-ear headphone). For instance, FIG. 2D shows an example housing 250 for such an implementation of the playback device 102. As shown, the housing 250 includes a headband 252 that couples a first earpiece 254 a to a second earpiece 254 b. Each of the earpieces 254 a and 254 b may house any portion of the electronic components in the playback device, such as one or more speakers. Further, one or more of the earpieces 254 a and 254 b may include a control area 258 configured to receive a user control indication for controlling audio playback, volume level, and other functions. The control area 258 may comprise any combination of the following: a capacitive touch sensor, a button, a switch, and a dial. As shown in FIG. 2D, the housing 250 may further include ear cushions 256 a and 256 b that are coupled to earpieces 254 a and 254 b, respectively. The ear cushions 256 a and 256 b may provide a soft barrier between the head of a user and the earpieces 254 a and 254 b, respectively, to improve user comfort and/or provide acoustic isolation from the ambient (e.g., passive noise reduction (PNR)). In some implementations, the wired and/or wireless headphones may be ultra-portable playback devices that are powered by an internal energy source and weigh less than fifty ounces.

In some instances, the headphone device may take the form of a hearable device. Hearable devices may include those headphone devices (e.g., ear-level devices) that are configured to provide a hearing enhancement function while also supporting playback of media content (e.g., streaming media content from a user device over a PAN, streaming media content from a streaming music service provider over a WLAN and/or a cellular network connection, etc.). In some instances, a hearable device may be implemented as an in-ear headphone device that is configured to play back an amplified version of at least some sounds detected from an external environment (e.g., all sound, select sounds such as human speech, etc.).

It should be appreciated that the playback device 102 may take the form of other wearable devices separate and apart from a headphone. Wearable devices may include those devices configured to be worn about a portion of a subject (e.g., an ear, a head, a neck, a torso, an arm, a wrist, a finger, a leg, an ankle, etc.). For example, the playback device 102 may take the form of a pair of glasses including a frame front (e.g., configured to hold one or more lenses), a first temple rotatably coupled to the frame front, and a second temple rotatably coupled to the frame front. In this example, the pair of glasses may comprise one or more transducers integrated into at least one of the first and second temples and configured to project sound towards an ear of the subject. While specific implementations of playback and network microphone devices have been described above with respect to FIGS. 2A, 2B, 2C, and 2D, there are numerous configurations of devices, including, but not limited to, those having no UI, microphones in different locations, multiple microphone arrays positioned in different arrangements, and/or any other configuration as appropriate to the requirements of a given application. For example, UIs and/or microphone arrays can be implemented in other playback devices and/or computing devices rather than those described herein. Further, although a specific example of playback device 102 is described with reference to MPS 100, one skilled in the art will recognize that playback devices as described herein can be used in a variety of different environments, including (but not limited to) environments with more and/or fewer elements, without departing from this invention. Likewise, MPS's as described herein can be used with various different playback devices.

By way of illustration, SONOS, Inc. presently offers (or has offered) for sale certain playback devices that may implement certain of the embodiments disclosed herein, including a "SONOS ONE," "PLAY:1," "PLAY:3," "PLAY:5," "PLAYBAR," "AMP," "CONNECT:AMP," "PLAYBASE," "BEAM," "CONNECT," and "SUB." Any other past, present, and/or future playback devices may additionally or alternatively be used to implement the playback devices of example embodiments disclosed herein. Additionally, it should be understood that a playback device is not limited to the examples illustrated in FIG. 2A, 2B, 2C, or 2D or to the SONOS product offerings. For example, a playback device may be integral to another device or component such as a television, a lighting fixture, or some other device for indoor or outdoor use.

b. Example Playback Device Configurations

FIGS. 3A-3E show example configurations of playback devices. Referring first to FIG. 3A, in some example instances, a single playback device may belong to a zone. For example, the playback device 102 c (FIG. 1A) on the Patio may belong to Zone A. In some implementations described below, multiple playback devices may be "bonded" to form a "bonded pair," which together form a single zone. For example, the playback device 102 f (FIG. 1A) named "Bed 1" in FIG. 3A may be bonded to the playback device 102 g (FIG. 1A) named "Bed 2" in FIG. 3A to form Zone B. Bonded playback devices may have different playback responsibilities (e.g., channel responsibilities). In another implementation described below, multiple playback devices may be merged to form a single zone. For example, the playback device 102 d named "Bookcase" may be merged with the playback device 102 m named "Living Room" to form a single Zone C. The merged playback devices 102 d and 102 m may not be specifically assigned different playback responsibilities. That is, the merged playback devices 102 d and 102 m may, aside from playing audio content in synchrony, each play audio content as they would if they were not merged.

For purposes of control, each zone in the MPS 100 may be represented as a single user interface ("UI") entity. For example, as displayed by the controller devices 104, Zone A may be provided as a single entity named "Portable," Zone B may be provided as a single entity named "Stereo," and Zone C may be provided as a single entity named "Living Room."

In various embodiments, a zone may take on the name of one of the playback devices belonging to the zone. For example, Zone C may take on the name of the Living Room device 102 m (as shown). In another example, Zone C may instead take on the name of the Bookcase device 102 d. In a further example, Zone C may take on a name that is some combination of the Bookcase device 102 d and Living Room device 102 m. The name that is chosen may be selected by a user via inputs at a controller device 104. In some embodiments, a zone may be given a name that is different than the device(s) belonging to the zone. For example, Zone B in FIG. 3A is named "Stereo" but none of the devices in Zone B have this name. In one aspect, Zone B is a single UI entity representing a single device named "Stereo," composed of constituent devices "Bed 1" and "Bed 2." In one implementation, the Bed 1 device may be playback device 102 f in the master bedroom 101 h (FIG. 1A) and the Bed 2 device may be the playback device 102 g also in the master bedroom 101 h (FIG. 1A).

As noted above, playback devices that are bonded may have different playback responsibilities, such as playback responsibilities for certain audio channels. For example, as shown in FIG. 3B, the Bed 1 and Bed 2 devices 102 f and 102 g may be bonded so as to produce or enhance a stereo effect of audio content. In this example, the Bed 1 playback device 102 f may be configured to play a left channel audio component, while the Bed 2 playback device 102 g may be configured to play a right channel audio component. In some implementations, such stereo bonding may be referred to as "pairing."

Additionally, playback devices that are configured to be bonded may have additional and/or different respective speaker drivers. As shown in FIG. 3C, the playback device 102 b named "Front" may be bonded with the playback device 102 k named "SUB." The Front device 102 b may render a range of mid to high frequencies, and the SUB device 102 k may render low frequencies as, for example, a subwoofer. When unbonded, the Front device 102 b may be configured to render a full range of frequencies. As another example, FIG. 3D shows the Front and SUB devices 102 b and 102 k further bonded with Right and Left playback devices 102 a and 102 j, respectively. In some implementations, the Right and Left devices 102 a and 102 j may form surround or "satellite" channels of a home theater system. The bonded playback devices 102 a, 102 b, 102 j, and 102 k may form a single Zone D (FIG. 3A).

In some implementations, playback devices may also be "merged." In contrast to certain bonded playback devices, playback devices that are merged may not have assigned playback responsibilities, but may each render the full range of audio content that each respective playback device is capable of. Nevertheless, merged devices may be represented as a single UI entity (i.e., a zone, as discussed above). For instance, FIG. 3E shows the playback devices 102 d and 102 m in the Living Room merged, which would result in these devices being represented by the single UI entity of Zone C. In one embodiment, the playback devices 102 d and 102 m may play back audio in synchrony, during which each outputs the full range of audio content that each respective playback device 102 d and 102 m is capable of rendering.

In some embodiments, a stand-alone NMD may be in a zone by itself. For example, the NMD 103 h from FIG. 1A is named "Closet" and forms Zone I in FIG. 3A. An NMD may also be bonded or merged with another device so as to form a zone. For example, the NMD device 103 f named "Island" may be bonded with the playback device 102 i Kitchen, which together form Zone F, which is also named "Kitchen." Additional details regarding assigning NMDs and playback devices as designated or default devices may be found, for example, in previously referenced U.S. Patent Publication No. 2017-0242653. In some embodiments, a stand-alone NMD may not be assigned to a zone.

Zones of individual, bonded, and/or merged devices may be arranged to form a set of playback devices that play back audio in synchrony. Such a set of playback devices may be referred to as a "group," "zone group," "synchrony group," or "playback group." In response to inputs provided via a controller device 104, playback devices may be dynamically grouped and ungrouped to form new or different groups that synchronously play back audio content. For example, referring to FIG. 3A, Zone A may be grouped with Zone B to form a zone group that includes the playback devices of the two zones. As another example, Zone A may be grouped with one or more other Zones C-I. The Zones A-I may be grouped and ungrouped in numerous ways. For example, three, four, five, or more (e.g., all) of the Zones A-I may be grouped. When grouped, the zones of individual and/or bonded playback devices may play back audio in synchrony with one another, as described in previously referenced U.S. Pat. No. 8,234,395. Grouped and bonded devices are example types of associations between portable and stationary playback devices that may be caused in response to a trigger event, as discussed above and described in greater detail below.

In various implementations, the zones in an environment may be assigned a particular name, which may be the default name of a zone within a zone group or a combination of the names of the zones within a zone group, such as "Dining Room+Kitchen," as shown in FIG. 3A. In some embodiments, a zone group may be given a unique name selected by a user, such as "Nick's Room," as also shown in FIG. 3A. The name "Nick's Room" may be a name chosen by a user over a prior name for the zone group, such as the room name "Master Bedroom."

Referring back to FIG. 2A, certain data may be stored in the memory 213 as one or more state variables that are periodically updated and used to describe the state of a playback zone, the playback device(s), and/or a zone group associated therewith. The memory 213 may also include the data associated with the state of the other devices of the media playback system 100, which may be shared from time to time among the devices so that one or more of the devices have the most recent data associated with the system.

In some embodiments, the memory 213 of the playback device 102 may store instances of various variable types associated with the states. Variable instances may be stored with identifiers (e.g., tags) corresponding to type. For example, certain identifiers may be a first type "a1" to identify playback device(s) of a zone, a second type "b1" to identify playback device(s) that may be bonded in the zone, and a third type "c1" to identify a zone group to which the zone may belong. As a related example, in FIG. 1A, identifiers associated with the Patio may indicate that the Patio is the only playback device of a particular zone and not in a zone group. Identifiers associated with the Living Room may indicate that the Living Room is not grouped with other zones but includes bonded playback devices 102 a, 102 b, 102 j, and 102 k. Identifiers associated with the Dining Room may indicate that the Dining Room is part of the Dining Room+Kitchen group and that devices 103 f and 102 i are bonded. Identifiers associated with the Kitchen may indicate the same or similar information by virtue of the Kitchen being part of the Dining Room+Kitchen zone group. Other example zone variables and identifiers are described below.
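
For illustration only, the tagged variable scheme above might be represented in memory as in the following minimal C sketch. All type and field names here are hypothetical assumptions, not the system's actual format.

```c
/* Minimal sketch of tagged zone state variables of types "a1", "b1",
 * and "c1". All names are hypothetical. */
#include <stdint.h>

enum zone_var_type {
    VAR_A1_ZONE_MEMBER,   /* "a1": playback device(s) of a zone     */
    VAR_B1_BONDED_MEMBER, /* "b1": device(s) bonded within the zone */
    VAR_C1_ZONE_GROUP     /* "c1": zone group the zone belongs to   */
};

struct zone_state_var {
    enum zone_var_type type;     /* identifier/tag for the type     */
    uint32_t device_or_group_id; /* which device or group it names  */
    uint64_t last_updated;       /* periodically refreshed          */
};
```

Under this sketch, the Living Room zone would hold "b1" entries for each of the bonded devices 102 a, 102 b, 102 j, and 102 k, and no "c1" entry, reflecting that it is not grouped with other zones.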

In yet another example, the MPS 100 may include variables or identifiers representing other associations of zones and zone groups, such as identifiers associated with Areas, as shown in FIG. 3A. An Area may involve a cluster of zone groups and/or zones not within a zone group. For instance, FIG. 3A shows a first area named "First Area" and a second area named "Second Area." The First Area includes zones and zone groups of the Patio, Den, Dining Room, Kitchen, and Bathroom. The Second Area includes zones and zone groups of the Bathroom, Nick's Room, Bedroom, and Living Room. In one aspect, an Area may be used to invoke a cluster of zone groups and/or zones that share one or more zones and/or zone groups of another cluster. In this respect, such an Area differs from a zone group, which does not share a zone with another zone group. Further examples of techniques for implementing Areas may be found, for example, in U.S. Patent Publication No. 2018-0107446 published Apr. 19, 2018 and titled "Room Association Based on Name," and U.S. Pat. No. 8,483,853 filed Sep. 11, 2007, and titled "Controlling and manipulating groupings in a multi-zone media system," each of which is incorporated herein by reference in its entirety. In some embodiments, the MPS 100 may not implement Areas, in which case the system may not store variables associated with Areas.

The memory 213 may be further configured to store other data. Such data may pertain to audio sources accessible by the playback device 102 or a playback queue that the playback device (or some other playback device(s)) may be associated with. In embodiments described below, the memory 213 is configured to store a set of command data for selecting a particular VAS when processing voice inputs.

During operation, one or more playback zones in the environment of FIG. 1A may each be playing different audio content. For instance, the user may be grilling in the Patio zone and listening to hip hop music being played by the playback device 102 c, while another user may be preparing food in the Kitchen zone and listening to classical music being played by the playback device 102 i. In another example, a playback zone may play the same audio content in synchrony with another playback zone. For instance, the user may be in the Office zone where the playback device 102 n is playing the same hip-hop music that is being played by playback device 102 c in the Patio zone. In such a case, playback devices 102 c and 102 n may be playing the hip-hop music in synchrony such that the user may seamlessly (or at least substantially seamlessly) enjoy the audio content that is being played out-loud while moving between different playback zones. Synchronization among playback zones may be achieved in a manner similar to that of synchronization among playback devices, as described in previously referenced U.S. Pat. No. 8,234,395.

As suggested above, the zone configurations of the MPS 100 may be dynamically modified. As such, the MPS 100 may support numerous configurations. For example, if a user physically moves one or more playback devices to or from a zone, the MPS 100 may be reconfigured to accommodate the change(s). For instance, if the user physically moves the playback device 102 c from the Patio zone to the Office zone, the Office zone may now include both the playback devices 102 c and 102 n. In some cases, the user may pair or group the moved playback device 102 c with the Office zone and/or rename the players in the Office zone using, for example, one of the controller devices 104 and/or voice input. As another example, if one or more playback devices 102 are moved to a particular space in the home environment that is not already a playback zone, the moved playback device(s) may be renamed or associated with a playback zone for the particular space.

Further, different playback zones of the MPS 100 may be dynamically combined into zone groups or split up into individual playback zones. For example, the Dining Room zone and the Kitchen zone may be combined into a zone group for a dinner party such that playback devices 102 i and 102 l may render audio content in synchrony. As another example, bonded playback devices in the Den zone may be split into (i) a television zone and (ii) a separate listening zone. The television zone may include the Front playback device 102 b. The listening zone may include the Right, Left, and SUB playback devices 102 a, 102 j, and 102 k, which may be grouped, paired, or merged, as described above. Splitting the Den zone in such a manner may allow one user to listen to music in the listening zone in one area of the living room space, and another user to watch the television in another area of the living room space. In a related example, a user may utilize either of the NMDs 103 a or 103 b (FIG. 1B) to control the Den zone before it is separated into the television zone and the listening zone. Once separated, the listening zone may be controlled, for example, by a user in the vicinity of the NMD 103 a, and the television zone may be controlled, for example, by a user in the vicinity of the NMD 103 b. As described above, however, any of the NMDs 103 may be configured to control the various playback and other devices of the MPS 100.

c. Example Controller Devices

FIG. 4A is a functional block diagram illustrating certain aspects of a selected one of the controller devices 104 of the MPS 100 of FIG. 1A. Controller devices in accordance with several embodiments of the invention can be used in various systems, such as (but not limited to) an MPS as described in FIG. 1A. Such controller devices may also be referred to herein as a "control device" or "controller." The controller device shown in FIG. 4A may include components that are generally similar to certain components of the network devices described above, such as a processor 412, memory 413 storing program software 414, at least one network interface 424, and one or more microphones 422. In one example, a controller device may be a dedicated controller for the MPS 100. In another example, a controller device may be a network device on which media playback system controller application software may be installed, such as for example, an iPhone™, iPad™ or any other smart phone, tablet, or network device (e.g., a networked computer such as a PC or Mac™).

The memory 413 of the controller device 104 may be configured to store controller application software and other data associated with the MPS 100 and/or a user of the system 100. The memory 413 may be loaded with instructions in software 414 that are executable by the processor 412 to achieve certain functions, such as facilitating user access, control, and/or configuration of the MPS 100. The controller device 104 may be configured to communicate with other network devices via the network interface 424, which may take the form of a wireless interface, as described above.

In one example, system information (e.g., such as a state variable) may be communicated between the controller device 104 and other devices via the network interface 424. For instance, the controller device 104 may receive playback zone and zone group configurations in the MPS 100 from a playback device, an NMD, or another network device. Likewise, the controller device 104 may transmit such system information to a playback device or another network device via the network interface 424. In some cases, the other network device may be another controller device.

The controller device 104 may also communicate playback device control commands, such as volume control and audio playback control, to a playback device via the network interface 424. As suggested above, changes to configurations of the MPS 100 may also be performed by a user using the controller device 104. The configuration changes may include adding/removing one or more playback devices to/from a zone, adding/removing one or more zones to/from a zone group, forming a bonded or merged player, and separating one or more playback devices from a bonded or merged player, among others.

As shown in FIG. 4A, the controller device 104 may also include a user interface 440 that is generally configured to facilitate user access and control of the MPS 100. The user interface 440 may include a touch-screen display or other physical interface configured to provide various graphical controller interfaces, such as the controller interfaces 440 a and 440 b shown in FIGS. 4B and 4C. Referring to FIGS. 4B and 4C together, the controller interfaces 440 a and 440 b include a playback control region 442, a playback zone region 443, a playback status region 444, a playback queue region 446, and a sources region 448. The user interface as shown is just one example of an interface that may be provided on a network device, such as the controller device shown in FIG. 4A, and accessed by users to control a media playback system, such as the MPS 100. Other user interfaces of varying formats, styles, and interactive sequences may alternatively be implemented on one or more network devices to provide comparable control access to a media playback system.

The playback control region 442 (FIG. 4B) may include selectable icons (e.g., selectable by way of touch or by using a cursor) that, when selected, cause playback devices in a selected playback zone or zone group to play or pause, fast forward, rewind, skip to next, skip to previous, enter/exit shuffle mode, enter/exit repeat mode, enter/exit cross fade mode, etc. The playback control region 442 may also include selectable icons that, when selected, modify equalization settings and/or playback volume, among other possibilities.

The playback zone region 443 (FIG. 4C) may include representations of playback zones within the MPS 100. The playback zone region 443 may also include a representation of zone groups, such as the Dining Room+Kitchen zone group, as shown. In some embodiments, the graphical representations of playback zones may be selectable to bring up additional selectable icons to manage or configure the playback zones in the MPS 100, such as a creation of bonded zones, creation of zone groups, separation of zone groups, and renaming of zone groups, among other possibilities.

For example, as shown, a "group" icon may be provided within each of the graphical representations of playback zones. The "group" icon provided within a graphical representation of a particular zone may be selectable to bring up options to select one or more other zones in the MPS 100 to be grouped with the particular zone. Once grouped, playback devices in the zones that have been grouped with the particular zone will be configured to play audio content in synchrony with the playback device(s) in the particular zone. Analogously, a "group" icon may be provided within a graphical representation of a zone group. In this case, the "group" icon may be selectable to bring up options to deselect one or more zones in the zone group to be removed from the zone group. Other interactions and implementations for grouping and ungrouping zones via a user interface are also possible. The representations of playback zones in the playback zone region 443 (FIG. 4C) may be dynamically updated as playback zone or zone group configurations are modified.

The playback status region 444 (FIG. 4B) may include graphical representations of audio content that is presently being played, previously played, or scheduled to play next in the selected playback zone or zone group. The selected playback zone or zone group may be visually distinguished on a controller interface, such as within the playback zone region 443 and/or the playback status region 444. The graphical representations may include track title, artist name, album name, album year, track length, and/or other relevant information that may be useful for the user to know when controlling the MPS 100 via a controller interface.

The playback queue region 446 may include graphical representations of audio content in a playback queue associated with the selected playback zone or zone group. In some embodiments, each playback zone or zone group may be associated with a playback queue comprising information corresponding to zero or more audio items for playback by the playback zone or zone group. For instance, each audio item in the playback queue may comprise a uniform resource identifier (URI), a uniform resource locator (URL), or some other identifier that may be used by a playback device in the playback zone or zone group to find and/or retrieve the audio item from a local audio content source or a networked audio content source, which may then be played back by the playback device.
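
For illustration only, a queue entry of the kind described above might look like the following minimal C sketch; the field names and sizes are hypothetical, not the actual queue format.

```c
/* Minimal sketch of a playback queue entry holding an identifier the
 * playback device can resolve to audio content. Names are hypothetical. */
#include <stddef.h>

struct queue_item {
    char uri[256];        /* URI/URL of the audio item               */
    char title[128];      /* track title shown in the queue region   */
    char artist[128];     /* artist name                             */
    unsigned duration_ms; /* track length in milliseconds            */
};

struct playback_queue {
    struct queue_item *items; /* zero or more audio items            */
    size_t count;
    size_t play_index;        /* item currently (or next) playing    */
};
```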

In one example, a playlist may be added to a playback queue, in which case information corresponding to each audio item in the playlist may be added to the playback queue. In another example, audio items in a playback queue may be saved as a playlist. In a further example, a playback queue may be empty, or populated but "not in use," when the playback zone or zone group is playing continuously streamed audio content, such as Internet radio that may continue to play until otherwise stopped, rather than discrete audio items that have playback durations. In an alternative embodiment, a playback queue can include Internet radio and/or other streaming audio content items and be "in use" when the playback zone or zone group is playing those items. Other examples are also possible.

When playback zones or zone groups are "grouped" or "ungrouped," playback queues associated with the affected playback zones or zone groups may be cleared or re-associated. For example, if a first playback zone including a first playback queue is grouped with a second playback zone including a second playback queue, the established zone group may have an associated playback queue that is initially empty, that contains audio items from the first playback queue (such as if the second playback zone was added to the first playback zone), that contains audio items from the second playback queue (such as if the first playback zone was added to the second playback zone), or that contains a combination of audio items from both the first and second playback queues. Subsequently, if the established zone group is ungrouped, the resulting first playback zone may be re-associated with the previous first playback queue or may be associated with a new playback queue that is empty or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped. Similarly, the resulting second playback zone may be re-associated with the previous second playback queue or may be associated with a new playback queue that is empty or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped. Other examples are also possible.
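
The grouping behavior just described can be summarized in code. The following is a minimal sketch of one possible policy for choosing the group queue when two zones are grouped, assuming the hypothetical `playback_queue` type from the earlier sketch; it is illustrative only, not the system's actual logic.

```c
/* One possible policy for selecting a zone group's queue when zones are
 * grouped: the group inherits the queue of the zone that was joined.
 * Uses the hypothetical playback_queue type from the earlier sketch. */
struct playback_queue make_group_queue(const struct playback_queue *a,
                                       const struct playback_queue *b,
                                       int b_added_to_a)
{
    /* If zone B was added to zone A, keep A's items; otherwise keep
     * B's. Alternative policies described above: start with an empty
     * queue, or combine items from both queues. */
    return b_added_to_a ? *a : *b;
}
```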

With reference still to FIGS. 4B and 4C, the graphical representations of audio content in the playback queue region 446 (FIG. 4B) may include track titles, artist names, track lengths, and/or other relevant information associated with the audio content in the playback queue. In one example, graphical representations of audio content may be selectable to bring up additional selectable icons to manage and/or manipulate the playback queue and/or audio content represented in the playback queue. For instance, a represented audio content may be removed from the playback queue, moved to a different position within the playback queue, or selected to be played immediately, or after any currently playing audio content, among other possibilities. A playback queue associated with a playback zone or zone group may be stored in a memory on one or more playback devices in the playback zone or zone group, on a playback device that is not in the playback zone or zone group, and/or on some other designated device. Playback of such a playback queue may involve one or more playback devices playing back media items of the queue, perhaps in sequential or random order.

The sources region 448 may include graphical representations of selectable audio content sources and/or selectable voice assistants associated with a corresponding VAS. The VASes may be selectively assigned. In some examples, multiple VASes, such as AMAZON's Alexa, MICROSOFT's Cortana, etc., may be invokable by the same NMD. In some embodiments, a user may assign a VAS exclusively to one or more NMDs. For example, a user may assign a first VAS to one or both of the NMDs 102 a and 102 b in the Living Room shown in FIG. 1A, and a second VAS to the NMD 103 f in the Kitchen. Other examples are possible.

d. Example Audio Content Sources

The audio sources in the sources region 448 may be audio content sources from which audio content may be retrieved and played by the selected playback zone or zone group. One or more playback devices in a zone or zone group may be configured to retrieve audio content for playback (e.g., according to a corresponding URI or URL for the audio content) from a variety of available audio content sources. In one example, audio content may be retrieved by a playback device directly from a corresponding audio content source (e.g., via a line-in connection). In another example, audio content may be provided to a playback device over a network via one or more other playback devices or network devices. As described in greater detail below, in some embodiments, audio content may be provided by one or more media content services.

Example audio content sources may include a memory of one or more playback devices in a media playback system such as the MPS 100 of FIG. 1, local music libraries on one or more network devices (e.g., a controller device, a network-enabled personal computer, or a network-attached storage ("NAS") device), streaming audio services providing audio content via the Internet (e.g., cloud-based music services), or audio sources connected to the media playback system via a line-in input connection on a playback device or network device, among other possibilities.

In some embodiments, audio content sources may be added to or removed from a media playback system such as the MPS 100 of FIG. 1A. In one example, an indexing of audio items may be performed whenever one or more audio content sources are added, removed, or updated. Indexing of audio items may involve scanning for identifiable audio items in all folders/directories shared over a network accessible by playback devices in the media playback system and generating or updating an audio content database comprising metadata (e.g., title, artist, album, track length, among others) and other associated information, such as a URI or URL for each identifiable audio item found. Other examples for managing and maintaining audio content sources may also be possible.
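
As an illustration of the indexing step described above, the following minimal C sketch walks a shared directory and records an entry per identifiable audio file; the file-extension check, function names, and output format are all hypothetical simplifications.

```c
/* Minimal sketch of audio indexing: walk a shared folder and record one
 * entry per identifiable audio file. A real indexer would recurse into
 * subdirectories, extract metadata (title, artist, album, track length),
 * and store a URI/URL per item in the audio content database. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

static int is_audio_file(const char *name)
{
    const char *dot = strrchr(name, '.');
    return dot && (!strcmp(dot, ".mp3") || !strcmp(dot, ".flac"));
}

void index_share(const char *root)
{
    DIR *dir = opendir(root);
    if (!dir)
        return;
    struct dirent *e;
    while ((e = readdir(dir)) != NULL) {
        if (is_audio_file(e->d_name))
            printf("indexed: %s/%s\n", root, e->d_name);
    }
    closedir(dir);
}
```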

III. Example Distributed Processing Architectures

As discussed above, a distributed processor architecture may be employed in devices, such as playback devices or other IoT devices, to significantly reduce power consumption. For example, a high-power processor that executes a GPOS may be employed, in at least some respects, as a co-processor to a less powerful (and less power hungry) processor executing an SPOS. As a result, the high-power processor can be completely powered off in situations where the functionality of the high-power processor is not needed without interrupting other operations, such as reading one or more capacitive touch sensors to detect audio playback commands, obtaining audio content via BLUETOOTH, and/or playing back the audio content. An example of a device employing such a distributed processing architecture is shown in FIG. 5 by device 500. The device 500 may be implemented as any of a variety of devices including any of the devices described herein (e.g., playback devices, NMDs, IoT devices, etc.).

As shown in FIG. 5, the device 500 comprises network interface component(s) 502 to facilitate communication with external devices. The network interface component(s) 502 include a first network circuit 520 to facilitate communication with a first computing device 510 over a first communication link 512 and may further include a second network circuit 522 to facilitate communication with a second computing device 516 over a second communication link 518. The device 500 further includes processing components 504 that are coupled to the network interface component(s) 502. The processing components 504 include first processor(s) 524 that execute first operating system(s) 528 and second processor(s) 526 that execute second operating system(s) 530. The processing components 504 may execute instructions stored in data storage 506 that may comprise a first memory 532 and a second memory 534. The processing components 504 may communicate with (and/or control) electronic component(s) 508 directly or via intermediary component(s) 514.

The network interface component(s) 502 may facilitate wireless communication to one or more external devices shown as the first computing device 510 and the second computing device 516. The network interface component(s) 502 may comprise the first network circuit 520 that enables communication over the first communication link 512 using a first communication protocol and a second network circuit 522 that enables communication over the second communication link 518 using a second, different communication protocol. For example, the first network circuit 520 may enable communication using an IEEE 802 protocol and/or a cellular network protocol while the second network circuit 522 may enable communication using another protocol, such as a BLUETOOTH protocol. Thus, the network interface component(s) 502 may enable communication (e.g., simultaneous communication) with multiple computing devices using different communication protocols.

In some embodiments, the first network circuit 520 may be implemented as a WIFI circuit (e.g., comprising a WIFI transceiver) that is configured to communicate with the first computing device 510 over a WIFI network. In these embodiments, the first computing device 510 may be, for example, a network router and/or a computing device that is accessible over the Internet (e.g., a cloud server). Additionally (or alternatively), the second network circuit 522 may be implemented as a BLUETOOTH circuit (e.g., comprising a BLUETOOTH transceiver) that is configured to communicate with the second computing device 516 using a BLUETOOTH connection. In such instances, the second computing device 516 may be, for example, a portable computing device such as a smartphone or a tablet.

The network circuits 520 and 522 may comprise one or more network processors that execute instructions stored in a memory that cause the network circuits 520 and 522 to perform various operations. For example, the network circuits 520 and 522 may each comprise a read-only memory (ROM) that stores firmware that may be executed by the one or more network processors. Examples of ROM include programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), and electrically erasable programmable read-only memory (EEPROM). Additionally (or alternatively), the network circuits 520 and 522 may comprise a read-write memory (e.g., a memory that is both readable and writable) that stores instructions that may be executed by the one or more network processors.

It should be appreciated that the network interface component(s) 502 may be implemented as one or more circuit dies. For example, the network interface component(s) 502 may be implemented as a single circuit die. In another example, the first network circuit 520 may be implemented as a first circuit die and the second network circuit 522 may be implemented as a second circuit die. Further, the network interface component(s) 502 may comprise more (or fewer) network circuits that facilitate communication over more (or fewer) communication protocols. For example, the network interface component(s) 502 may comprise three network circuits including a first network circuit configured to facilitate communication over at least one WLAN (e.g., a WIFI network), a second network circuit configured to facilitate communication over at least one PAN (e.g., a BLUETOOTH network), and a third network circuit configured to facilitate communication over a cellular network (e.g., a 4G network, an LTE network, and/or a 5G network). Thus, the network interface component(s) 502 may be implemented in any of a variety of ways.

The processing components 504 may be coupled to the network interface component(s) 502 and configured to control one or more aspects of the operation of the device 500. The processing components 504 may comprise first processor(s) 524 and second processor(s) 526. The first processor(s) 524 may have a different construction than the second processor(s) 526. Additionally, the first processor(s) 524 may execute first operating system(s) 528 while the second processor(s) 526 may execute second operating system(s) 530 that are different from the first operating system(s) 528.

In some embodiments, the first processor(s) 524 may not be configured to support virtualized memory and the first operating system(s) 528 may comprise an operating system that does not require support for virtualized memory, such as an RTOS or other SPOS. For example, the first processor(s) 524 may not comprise a memory management unit (MMU) configured to translate virtual memory addresses to physical addresses. In these embodiments, the first processor(s) 524 may comprise a general-purpose processor (GPP), such as a reduced instruction set computer (RISC) processor, and/or a single-purpose processor (SPP), such as a DSP, a graphics processing unit (GPU), or a neural processing unit (NPU). For example, the first processor(s) 524 may comprise a RISC processor and a DSP. Example GPPs that do not support virtualized memory include ARM CORTEX-M series processors (e.g., CORTEX-M0, CORTEX-M0+, CORTEX-M1, CORTEX-M3, CORTEX-M4, CORTEX-M7, CORTEX-M23, CORTEX-M33, and CORTEX-M35P processors). Example SPPs that do not support virtualized memory include TENSILICA HIFI DSPs (e.g., HIFI MINI, HIFI 3, HIFI 3z, HIFI 4, and HIFI 5 DSPs).

In some embodiments, the second processor(s) 526 may be configured to support virtualized memory and the second operating system(s) 530 may comprise an operating system that at least partially employs virtualized memory, such as a GPOS. For example, the second processor(s) 526 may comprise a memory management unit (MMU) configured to translate virtual memory addresses to physical addresses. In these embodiments, the second processor(s) 526 may comprise a GPP. Example GPPs that support virtualized memory include application processors such as ARM CORTEX-A series processors (e.g., CORTEX-A5, CORTEX-A7, CORTEX-A8, CORTEX-A9, CORTEX-A12, CORTEX-A15, CORTEX-A17, CORTEX-A32, CORTEX-A35, CORTEX-A53, CORTEX-A57, CORTEX-A72, CORTEX-A73, CORTEX-A75, and CORTEX-A76 processors).

One or more of the processors in the plurality of processing components 504 (e.g., first processor(s) 524 and/or second processor(s) 526) may have a plurality of power states including an awake state and one or more low-power states (e.g., one or more sleep states such as a light sleep state and a deep sleep state). In an awake state, the processor may be capable of executing instructions, power may be maintained to the processor caches (e.g., L1, L2, and/or L3 caches), and the clocks may be on (e.g., core clock, bus clock, etc.). In light sleep states, the power consumption may be reduced relative to the awake states by, for example, turning off (or lowering the frequency of) one or more clocks while maintaining power to the processor caches. Thus, light sleep states may offer some power consumption reduction relative to awake states while still being able to transition to awake states expeditiously. In deep sleep states, the power consumption may be reduced relative to the light sleep states by, for example, both turning off one or more clocks and powering down one or more processor caches. Deep sleep states may include those states where the processor is entirely powered off. Thus, deep sleep states may offer an additional power consumption reduction relative to light sleep states and require additional time to transition to awake states relative to light sleep states.
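
For illustration only, the per-processor power states described above might be modeled as follows; this is a minimal sketch with hypothetical names, not an actual firmware interface.

```c
/* Minimal sketch of per-processor power states as described above.
 * Names are hypothetical. */
enum proc_power_state {
    PROC_AWAKE,       /* executing instructions; caches powered,     */
                      /* core/bus clocks on                          */
    PROC_LIGHT_SLEEP, /* clocks off or at reduced frequency; caches  */
                      /* still powered, so wake-up is fast           */
    PROC_DEEP_SLEEP   /* clocks off and caches powered down, perhaps */
                      /* fully powered off; slowest to wake          */
};
```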

Given that the first processor(s) 524 may have a different construction than the second processor(s) 526, the first processor(s) 524 may have a different peak power consumption (e.g., power consumption under full load) than the second processor(s) 526. For example, the first processor(s) 524 may have a lower peak power consumption than the second processor(s) 526. The difference in power consumption may arise at least in part from the increased complexity of the second processor(s) 526 to provide, for example, virtual memory support. Thus, in some embodiments, operations are distributed between the first processor(s) 524 and the second processor(s) 526 such that only those operations that cannot be practically performed by the first processor(s) 524 are performed by the second processor(s) 526. In these embodiments, the first processor(s) 524 may cause the second processor(s) 526 to remain in a low-power state until a particular operation needs to be performed that requires the second processor(s) 526. As a result, the second processor(s) 526 may, in at least some respects (and/or situations), function as one or more co-processors to the first processor(s) 524.
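
The distribution rule in the preceding paragraph, waking the high-power processor only for operations the low-power processor cannot practically perform, might look like the following sketch. Every operation and helper name here is an assumption for illustration.

```c
/* Minimal sketch: the low-power processor gates use of the high-power
 * processor, waking it only for operations it cannot practically
 * perform itself. All helper functions are hypothetical. */
enum op { OP_TOUCH_READ, OP_BT_AUDIO, OP_WIFI_STREAM, OP_VOICE_UPLOAD };

/* Hypothetical platform helpers, provided elsewhere. */
void wake_second_processor(void);
void offload_to_second(enum op o);
void sleep_second_when_idle(void);
void run_on_first(enum op o);

static int requires_second_processor(enum op o)
{
    /* Complex operations (e.g., WIFI streaming, voice upload) need the
     * GPOS processor; everything else stays on the SPOS processor. */
    return o == OP_WIFI_STREAM || o == OP_VOICE_UPLOAD;
}

void dispatch(enum op o)
{
    if (requires_second_processor(o)) {
        wake_second_processor();  /* bring 526 out of low-power state */
        offload_to_second(o);     /* e.g., via shared memory or SPI   */
        sleep_second_when_idle(); /* let 526 return to low power      */
    } else {
        run_on_first(o);          /* handled locally by 524           */
    }
}
```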

The data storage 506 may comprise, for example, one or more tangible, non-transitory, computer-readable media configured to store instructions that are executable by the processing components 504. The data storage 506 may comprise any combination of volatile memory (e.g., a memory that only maintains data while powered) and non-volatile memory (e.g., a memory that maintains data even after being power cycled). Examples of volatile memory include random-access memory (RAM) such as static random-access memory (SRAM) and dynamic random-access memory (DRAM). Examples of non-volatile memory include flash memory, such as NOR flash memory and NAND flash memory, disk drives, and magnetic tape.

The data storage 506 may comprise a first memory 532 and a second memory 534. In some embodiments, the first memory 532 may be directly accessible only by the first processor(s) 524 (and thus not directly accessible by the second processor(s) 526) and the second memory 534 may be directly accessible only by the second processor(s) 526 (and thus not directly accessible by the first processor(s) 524). In these embodiments, the first and second processor(s) 524 and 526, respectively, may share information via one or more communication buses, such as an SPI bus. In other embodiments, at least one of the first memory 532 and the second memory 534 may be a shared memory that is directly accessible by both the first processor(s) 524 and the second processor(s) 526. In these embodiments, the first and second processor(s) 524 and 526, respectively, may share information by storing the information to be shared in the shared memory. Additionally (or alternatively), the first and second processor(s) 524 and 526, respectively, may share information via one or more communication buses.
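
As an illustration of the shared-memory option, the two processors might exchange messages through a mailbox region like the one sketched below. The layout and names are hypothetical, and a real design would also need an uncached or cache-coherent mapping, memory barriers, and an interrupt to notify the peer.

```c
/* Minimal sketch of a one-slot mailbox in shared memory for passing
 * messages between the first and second processor(s). Hypothetical. */
#include <stdint.h>

struct shared_mailbox {
    volatile uint32_t ready;      /* 0 = empty, 1 = message pending  */
    volatile uint32_t msg_id;     /* e.g., "start WIFI stream"       */
    volatile uint8_t payload[60]; /* message-specific data           */
};

/* Producer side (e.g., the first processor requesting an offload). */
static void mailbox_post(struct shared_mailbox *mb, uint32_t id)
{
    while (mb->ready)             /* wait until the peer drained it  */
        ;
    mb->msg_id = id;
    mb->ready = 1;                /* a real design would add a memory */
                                  /* barrier and raise an interrupt   */
}
```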

It should be appreciated that the processing components 504 and the data storage 506 may be implemented in any of a variety of ways. In some embodiments, each of the first processor(s) 524 is separate and distinct from the second processor(s) 526. For example, the first processor(s) 524 may be combined with at least part of the first memory 532 in a first system-on-chip (SoC) and the second processor(s) 526 may be combined with at least part of the second memory 534 in a second SoC that is separate from the first SoC. In other embodiments, the first processor(s) 524 may be combined with the second processor(s) 526 in a single circuit die. For example, the first processor(s) 524, the second processor(s) 526, at least part of the first memory 532, and at least part of the second memory 534 may be integrated into a single SoC. Thus, the processing components 504 and the data storage 506 may be implemented in any number of circuit dies.

The electronic component(s) 508 may comprise any of a variety of components that the processing components 504 may control or otherwise communicate with. Examples of such components include: a display, an electric motor, a heating element, a switch, a speaker, a light, and a sensor (e.g., a microphone, a capacitive touch sensor, an infrared light sensor, etc.). The implementation of the electronic component(s) 508 may vary based on the particular function of the device 500. For example, the device 500 may be a playback device and the electronic component(s) 508 may comprise a speaker for sound reproduction and one or more capacitive touch sensors for detection of audio playback commands (e.g., play/pause, increase volume, decrease volume, etc.).

Some electronic component(s) 508 may not directly interface with the processing components 504. Instead, these electronic component(s) 508 may interface with the processing components 504 via intermediary component(s) 514. For example, the electronic component(s) 508 may comprise a capacitive touch sensor that the processing components 504 may not be able to directly read. In this example, the intermediary component(s) 514 may comprise a programmable SoC (PSoC) that is configured to read the capacitive touch sensor and provide an output over a communication bus (e.g., an I2C bus) that may be received by the processing components 504. Other example intermediary component(s) 514 include audio codecs and amplifiers (e.g., class D audio amplifiers).
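
To illustrate this intermediary arrangement, a processor might poll the touch controller over I2C as sketched below; the bus address, register, and `i2c_read_reg` helper are all hypothetical, not an actual vendor API.

```c
/* Minimal sketch: read a touch event from a hypothetical PSoC touch
 * controller over I2C. Address and register values are made up. */
#include <stdint.h>

#define TOUCH_I2C_ADDR  0x28  /* hypothetical 7-bit bus address       */
#define TOUCH_REG_EVENT 0x01  /* hypothetical "latest event" register */

/* Platform-provided I2C register read (hypothetical signature). */
extern int i2c_read_reg(uint8_t addr, uint8_t reg, uint8_t *out);

int poll_touch_event(void)
{
    uint8_t event = 0;
    if (i2c_read_reg(TOUCH_I2C_ADDR, TOUCH_REG_EVENT, &event) != 0)
        return -1;            /* bus error                            */
    /* e.g., 1 = play/pause tap, 2 = volume-up swipe, ...             */
    return (int)event;
}
```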

In some embodiments, only the first processor(s) 524 communicate (e.g., are communicatively coupled) with the intermediary component(s) 514 and/or the electronic component(s) 508. Thus, the second processor(s) 526 may not directly communicate with the intermediary component(s) 514 and/or electronic component(s) 508. By routing all communication with the intermediary component(s) 514 and/or the electronic component(s) 508 through the first processor(s) 524, the second processor(s) 526 may be completely turned off without interfering with such communication. For example, the first processor(s) 524 may communicate with the intermediary component(s) 514 over an I2C bus that is not directly accessible by the second processor(s) 526 (e.g., the second processor(s) 526 cannot directly transmit and/or receive data over the I2C bus). In other embodiments, both the first processor(s) 524 and the second processor(s) 526 can communicate (e.g., are communicatively coupled) with the intermediary component(s) 514 and/or the electronic component(s) 508.

In some embodiments, the second processor(s) 526 may be booted before the first processor(s) 524. For example, the second processor(s) 526 may initially boot first and provide code to the first processor(s) 524 over a communication bus, such as an SPI bus, and/or a shared memory, such as a shared RAM. The first processor(s) 524 may boot upon receipt of the code from the second processor(s) 526. Once the first processor(s) 524 have completed booting, the second processor(s) 526 may be put in a low-power state should the second processor(s) 526 no longer be needed. In other embodiments, the first processor(s) 524 may be booted before the second processor(s) 526.
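
One hypothetical rendering of that boot order, the second processor loading code for the first and then retiring to a low-power state, is sketched below; every helper name is an assumption, not a real platform API.

```c
/* Minimal sketch of the boot handoff described above. All helpers are
 * hypothetical platform functions. */
extern void load_firmware_to_peer(const void *image, unsigned len);
extern void release_peer_from_reset(void);
extern int  peer_boot_complete(void);
extern void enter_deep_sleep(void);

void boot_sequence_on_second_processor(const void *spos_image,
                                       unsigned image_len)
{
    /* 1. Copy the SPOS image to the first processor's memory
     *    (e.g., over an SPI bus or into shared RAM). */
    load_firmware_to_peer(spos_image, image_len);

    /* 2. Let the first processor start executing the image. */
    release_peer_from_reset();
    while (!peer_boot_complete())
        ;                         /* wait for a ready flag            */

    /* 3. No further work for the GPOS processor: power down. */
    enter_deep_sleep();
}
```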

It should be appreciated that the first processor(s) 524 may boot at least partially in parallel with the second processor(s) 526. For example, the second processor(s) 526 may start booting first and, during the boot process, cause the first processor(s) 524 to boot (e.g., via at least one trigger signal, by providing code to the first processor(s) 524, or any combination thereof). In this example, the second processor(s) 526 may complete the remainder of the boot process at least partially in parallel with the first processor(s) 524 booting. In another example, the first processor(s) 524 may start booting first and, during the boot process, cause the second processor(s) 526 to boot (e.g., via at least one trigger signal, by providing code to the second processor(s) 526, or any combination thereof). In this example, the first processor(s) 524 may complete the remainder of the boot process at least partially in parallel with the second processor(s) 526 booting.

It should be appreciated that one or more components may be omitted from the device 500 without departing from the scope of the present disclosure. In some embodiments, the device 500 may only communicate using a single protocol (or set of protocols), such as IEEE 802 protocols, and the second network circuit 522 that enables communication with the second computing device 516 may be omitted. Additionally (or alternatively), the electronic component(s) 508 in the device 500 may not need any of the intermediary component(s) 514. For example, the electronic component(s) 508 may only include components that may directly interface with the processing components 504. Thus, the intermediary component(s) 514 may be omitted.

In some embodiments, aspects of the distributed architecture shown in FIG. 5 may be distributed between two playback devices. In these embodiments, a first subset of the components shown in FIG. 5 may be integrated into a first playback device and a second subset of the components shown in FIG. 5 (which may or may not overlap with the components of the first subset) may be integrated into a second playback device (e.g., that may be communicatively coupled via a PAN, WLAN, or other connection to the first playback device). For example, a first playback device (e.g., a wearable device such as a headphone device and/or a hearable device) may comprise the first processor(s) 524 while a second playback device (e.g., a stationary playback device such as a soundbar) may comprise the second processor(s) 526. In this example, the first playback device may offload those intensive operations that may not be suitable for the first processor(s) 524 to the second playback device to be performed using the second processor(s) 526. By employing such an architecture, the power consumption of the first playback device may be advantageously kept very low (e.g., to maximize battery life) while still supporting complex operations when connected to a common data network with the second playback device. Accordingly, the techniques described herein may be readily employed in a system of two or more devices (e.g., playback devices).

In some embodiments, aspects of the distributed architecture shown in FIG. 5 may be integrated into a module (e.g., a circuit board assembly such as a system-on-a-module (SoM)) for easy integration into a device. An example of such a module implementation is shown in FIG. 6 by module 600. As shown, the module 600 comprises circuit board(s) 602 onto which various components may be attached, including processing components 504, data storage 506, and power component(s) 612. The network interface component(s) 502 may be partially integrated into the module 600. For example, internal network interface component(s) 616A may be mounted to the circuit board(s) 602 and communicate via communication interface 606 with external network interface component(s) 616B that are not attached to the circuit board(s) 602. Similarly, the intermediary component(s) 514 may be partially integrated into the module 600. For example, internal intermediary component(s) 610A may be mounted to the circuit board(s) 602 and communicate via electronic component interface 608 with external intermediary component(s) 610B that are not attached to the circuit board(s) 602.

The circuit board(s) 602 may comprise a substrate (e.g., an insulative substrate) and a plurality of conductive elements (e.g., circuit traces, pads, vias, etc.). The substrate may provide mechanical support for the components mounted to the circuit board(s) 602. The substrate may be a rigid substrate (e.g., to form a rigid circuit board) or a flexible substrate (e.g., to form a flexible circuit board). The plurality of conductive elements may be disposed on and/or integrated with the substrate to couple (e.g., electrically couple) components attached to the circuit board(s) 602.

The power component(s) 612 may distribute power to one or more other components of the module 600 (e.g., other components attached to the circuit board(s) 602). The power component(s) 612 may perform, for example, any combination of the following operations: (1) DC/DC conversion, (2) battery charging, and (3) power sequencing. The power component(s) 612 may be implemented as, for example, a power management integrated circuit (PMIC). The power component(s) 612 may receive power from a power source 614 via a power interface 604. The power source 614 may comprise an internal power source, such as a battery, and/or an external power source, such as a wall outlet. The power interface 604 may comprise one or more ports (e.g., one or more electrical connectors attached to the circuit board(s) 602) where the module 600 may be coupled (e.g., electrically coupled) to the power source 614.

The processing components 504 and the data storage 506 may be attached to the circuit board(s) 602 in a variety of ways depending on, for example, how the processing components 504 and the data storage 506 are constructed. In some embodiments, the processing components 504 and the data storage 506 may be integrated into a single system-on-a-chip (SoC) that may be attached to the circuit board(s) 602. In other embodiments, the processing components 504 and the data storage 506 may be integrated into separate circuit dies that may be separately attached to the circuit board(s) 602 (e.g., and electrically coupled using circuit traces). For example, first processor(s) (e.g., first processor(s) 524) and a first portion of the data storage 506 (e.g., a volatile memory accessible by the first processor(s)) may be integrated into a first SoC, second processor(s) (e.g., second processor(s) 526) and a second portion of the data storage 506 (e.g., a volatile memory accessible by the second processor(s)) may be integrated into a second SoC, and a remainder of the data storage 506 (e.g., a non-volatile memory accessible by the first and/or second processor(s)) may be integrated into a separate memory integrated circuit (IC). In this example, each of the first SoC, the second SoC, and the memory IC may be attached to the circuit board(s) 602. Thus, the processing components 504 and the data storage 506 may be distributed between any number of ICs that may be attached to the circuit board(s) 602.

The network interface component(s) 502 may be distributed between the internal network interface component(s) 616A that may be attached to the circuit board(s) 602 and the external network interface component(s) 616B that may be external to the module 600. The internal network interface component(s) 616A may be coupled to the external network interface component(s) 616B via a communication interface 606. The communication interface 606 may comprise one or more ports (e.g., one or more electrical connectors attached to the circuit board(s) 602) where the module 600 may be coupled (e.g., electrically coupled) to the external network interface component(s) 616B. The particular way in which the network interface component(s) 502 are distributed may vary based on the particular implementation. In some embodiments, the internal network interface component(s) 616A may comprise one or more ICs to generate wireless signals including, for example, one or more wireless transceiver ICs (e.g., a WIFI transceiver IC, a BLUETOOTH transceiver IC, a WIFI and BLUETOOTH transceiver IC, a cellular transceiver IC, etc.) while the external network interface component(s) 616B may comprise one or more components that radiate the wireless signals (e.g., one or more antennas). In other embodiments, all of the network interface component(s) 502 may be integrated into the internal network interface component(s) 616A and the communication interface 606 may be removed. In still yet other embodiments, all of the network interface component(s) 502 may be integrated into the external network interface component(s) 616B and the communication interface 606 may couple the processing components to the external network interface component(s) 616B.

The intermediary component(s) 514 may be distributed between the internal intermediary component(s) 610A that may be attached to the circuit board(s) 602 and the external intermediary component(s) 610B that may be external to the module 600. The internal intermediary component(s) 610A may be coupled to the external intermediary component(s) 610B via an electronic component interface 608. The electronic component interface 608 may comprise one or more ports (e.g., one or more electrical connectors attached to the circuit board(s) 602) where the module 600 may be coupled (e.g., electrically coupled) to the external intermediary component(s) 610B. The particular way in which the intermediary component(s) 514 are distributed may vary based on the particular implementation. In some embodiments, all of the intermediary component(s) 514 may be integrated into the internal intermediary component(s) 610A. For example, the internal intermediary component(s) 610A may comprise one or more audio amplifiers that are coupled (via the electronic component interface 608) to electronic component(s) 508, such as one or more speakers. In other embodiments, each of the internal intermediary component(s) 610A and the external intermediary component(s) 610B may comprise at least one component. In still yet other embodiments, all of the intermediary component(s) 514 may be integrated into the external intermediary component(s) 610B.

It should be appreciated that the module 600 shown in FIG. 6 may be modified without departing from the scope of the present disclosure. In some embodiments, the power component(s) 612 may be made external to the module 600. In such embodiments, the power interface 604 may couple the external power component(s) 612 to one or more components attached to the circuit board(s) 602 (e.g., the processing components 504 and/or the data storage 506).

IV. Example Power Management Techniques

FIG. 7 illustrates examples of power states through which a playback device (e.g., employing at least some of the components shown in device 500, such as the plurality of processing components 504) may transition to facilitate lowering power consumption while still enabling support for complex features, such as audio streaming over WIFI. As described above, certain components (e.g., a sophisticated processor, such as an application processor) can be put in a low-power state (including being turned off) in situations where complex operations that may necessitate those components are unlikely to be performed (and/or not supported for use in the given situation). For example, a battery-powered playback device, such as a portable and/or wearable playback device, can transition between various power states in concert with detected changes in the operating environment (and/or a current operating mode) of the playback device to facilitate maximizing run time/battery life.

As shown, the set of power states includes first through sixth states, denoted as P1 through P6. The position of a particular power state along the vertical axis is indicative of the power consumed by the playback device while in that power state. For example, P1 corresponds to the highest power state, followed by P2, and P6 corresponds to the lowest power state.

In the P1 state, the plurality of processing components 504 (e.g., first processor(s) 524 and second processor(s) 526) are in an awake state. As noted above, in the awake state, the processors may be capable of executing instructions, power may be maintained to the processor caches (e.g., L1, L2, and/or L3 caches), and the clocks may be on (e.g., core clock, bus clock, etc.).

In the P2 state, the second processor(s) 526 are in a light sleep state. For example, as noted above, clocks of the second processor(s) 526 may be turned off or lowered in frequency (e.g., to a minimum frequency). The first processor(s) 524 may be in the awake state. That is, the power and clocks of the first processor(s) 524 may be configured to facilitate the performance of operations by the first processor(s) 524.

In the P3 state, the second processor(s) 526 and the first processor(s) 524 may both be in a light sleep state. That is, one or more clocks of the second processor(s) 526 and the first processor(s) 524 may be turned off or lowered in frequency (e.g., to a minimum frequency).

In the P4 state, the second processor(s) 526 may be in a deep sleep state. For example, a voltage supplied to the second processor(s) 526 can be reduced (e.g., to zero or near zero) and one or more clocks of the second processor(s) 526 may be turned off. In the P4 state, the first processor(s) 524 may be in an awake state.

In the P5 state, the second processor(s) 526 may be in a deep sleep state. For example, a voltage supplied to the second processor(s) 526 can be reduced (e.g., to zero or near zero) and one or more clocks of the second processor(s) 526 may be turned off. In the P5 state, the first processor(s) 524 may be in the light sleep state.

In the P6 state, the second processor(s) 526 and the first processor(s) 524 may both be in a deep sleep state. For example, voltage supplied to both the second processor(s) 526 and the first processor(s) 524 can be reduced (e.g., to zero or near zero) and one or more clocks of the second processor(s) 526 and the first processor(s) 524 may be turned off. In the P6 state, the current drawn by the playback device may be reduced to, for example, leakage current associated with, among other things, power supply circuitry of the playback device.
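
The six states above pair the two processors' sleep levels as summarized in the following lookup table, an illustrative sketch reusing the hypothetical power-state enum from the earlier sketch.

```c
/* Illustrative mapping of device power states P1-P6 to per-processor
 * states (see enum proc_power_state in the earlier sketch). */
struct device_power_state {
    enum proc_power_state first;  /* first processor(s) 524  */
    enum proc_power_state second; /* second processor(s) 526 */
};

static const struct device_power_state power_table[6] = {
    { PROC_AWAKE,       PROC_AWAKE       }, /* P1: highest power */
    { PROC_AWAKE,       PROC_LIGHT_SLEEP }, /* P2                */
    { PROC_LIGHT_SLEEP, PROC_LIGHT_SLEEP }, /* P3                */
    { PROC_AWAKE,       PROC_DEEP_SLEEP  }, /* P4                */
    { PROC_LIGHT_SLEEP, PROC_DEEP_SLEEP  }, /* P5                */
    { PROC_DEEP_SLEEP,  PROC_DEEP_SLEEP  }  /* P6: lowest power  */
};
```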

While six power states are illustrated, it is understood that there can be a different number of power states. That is, the amount of power drawn by the playback device can fall within a different number of discrete states (e.g., greater or fewer than six states). For example, while the processor(s) described above are specified as being capable of transitioning between awake, light sleep, and deep sleep states, there can be any number of sub-power states within these power states. As an example, a particular portion of the second processor(s) 526 can be de-powered or de-clocked to minimize power consumption associated with that portion. At the same time, the second processor(s) 526 can be actively performing operations that do not require that portion. Therefore, even though awake, the power consumed by the second processor(s) 526 will be less than the power otherwise consumed by the second processor(s) 526 when all the portions are active. Portions of the first processor(s) 524 can be similarly de-powered or de-clocked to minimize power consumption associated with those portions.

In some embodiments, a playback device (e.g., implementing one or more components of device 500 such as processing components 504) may transition between various power states (e.g., power states P1-P6 above) based on a current mode of operation of the playback device. For example, the playback device may have a plurality of modes of operation that each have different processing needs (e.g., some modes may support complex operations while other modes may not). Thus, processor(s) (or other components) that are not needed to support the particular operations associated with a given mode of operation may be put in a low-power state (e.g., a sleep state such as a light sleep or deep sleep state). In some instances, the playback device may employ a set of one or more criteria that cause the playback device to transition between certain modes of operation (e.g., trigger events). FIG. 15 illustrates an example state diagram including a plurality of modes of operation of a playback device, associated trigger events that may cause transitions between the plurality of modes of operation, and example power states (referencing FIG. 7) in which the playback device may be while operating in each mode. Upon entry into at least some of the plurality of modes, the playback device may transition between power states (e.g., transition between one or more of power states P1-P6). As shown, the plurality of operation modes includes an off mode 1502, a home mode 1504, a home idle mode 1506, an away mode 1508, and an away idle mode 1510.
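
As a rough illustration of the kind of mode/trigger state machine that FIG. 15 describes, the C sketch below maps example trigger events onto mode transitions. The event names and the transition table are assumptions drawn from the examples discussed in the following paragraphs, not a definitive implementation.

    #include <stdio.h>

    /* Operating modes of FIG. 15 (numbers are the figure's reference labels). */
    typedef enum {
        OFF_1502, HOME_1504, HOME_IDLE_1506, AWAY_1508, AWAY_IDLE_1510
    } op_mode_t;

    /* Example trigger events; the exact event set is an assumption. */
    typedef enum {
        EV_BUTTON_ON, EV_WLAN_LOST, EV_WLAN_FOUND, EV_INACTIVITY, EV_ACTIVITY
    } event_t;

    static op_mode_t next_mode(op_mode_t m, event_t ev)
    {
        switch (m) {
        case OFF_1502:       return ev == EV_BUTTON_ON  ? HOME_1504      : m;
        case HOME_1504:      return ev == EV_WLAN_LOST  ? AWAY_1508
                                  : ev == EV_INACTIVITY ? HOME_IDLE_1506 : m;
        case HOME_IDLE_1506: return ev == EV_ACTIVITY   ? HOME_1504      : m;
        case AWAY_1508:      return ev == EV_WLAN_FOUND ? HOME_1504
                                  : ev == EV_INACTIVITY ? AWAY_IDLE_1510 : m;
        case AWAY_IDLE_1510: return ev == EV_ACTIVITY   ? AWAY_1508      : m;
        }
        return m;
    }

    int main(void)
    {
        op_mode_t m = OFF_1502;
        m = next_mode(m, EV_BUTTON_ON);  /* off -> home      */
        m = next_mode(m, EV_WLAN_LOST);  /* home -> away     */
        m = next_mode(m, EV_INACTIVITY); /* away -> away idle */
        printf("final mode: %d\n", (int)m);
        return 0;
    }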

In the off mode 1502, the playback device may be powered off. For example, the playback device may be incapable of performing most (if not all) functionality associated with the playback device in the off mode 1502. The playback device may put most (if not all) of its components in a low power mode (including turning them off). For example, the playback device may be in the power state P6. The power consumption of the playback device may be at a minimum when operating in the off mode.

Upon detecting a trigger event, the playback device may transition from the off mode 1502 to the home mode 1504. As shown, the trigger event that causes the playback device to transition from the off mode 1502 to the home mode 1504 may include, for example, detecting activation of an on button or other element of an interface on the playback device. In the home mode 1504, the playback device may be expected to be in an environment where a WLAN is likely available (e.g., inside a home associated with a user (e.g., a home of the user, a home of a friend of the user, etc.), nearby a home of a user (e.g., outside in a backyard), in a hotel room associated with the user (e.g., a hotel room of the user, a hotel room of a friend of the user, etc.), at a workplace associated with the user (e.g., a workplace of the user, a workplace of a friend of the user, etc.)) and there may (or may not) be a connection over a PAN to a user device. Given the environment of the playback device when in home mode 1504, the playback device may be expected to be capable of performing most (if not all) of the functionality associated with the playback device. For example, the playback device may support one or more of the following functions: (1) playback of audio streamed from an audio service provider over a WLAN (e.g., when connected to a WLAN); (2) playback of audio from another device (e.g., another playback device, such as a soundbar) on a WLAN (e.g., when connected to a WLAN); (3) playback of audio streamed from a user device over a PAN (e.g., when connected to a PAN); (4) voice commands associated with a VAS provider where the captured voice utterance is sent to a remote server over a WLAN (e.g., when connected to a WLAN); and/or (5) call handling (e.g., playing back audio from the call and outputting detected voice utterances on the call) for calls received over a PAN via a user device (e.g., when connected to a PAN).

Given the high level of functionality that may be invoked by a user when the playback device is in the home mode 1504, the playback device may put most (if not all) of the components of the playback device in a high power state (e.g., an awake state) to support the wide range of functionality that may be invoked by the user. For example, the playback device may be in the power state P1 when in home mode 1504. The power consumption of the playback device may be at a maximum when operating in the home mode.

Upon detecting a trigger event, the playback device may transition from the home mode 1504 to the away mode 1508. As shown, the trigger event that causes the playback device to transition from the home mode 1504 to the away mode 1508 may include, for example, detecting a loss of a connection to the WLAN (e.g., one or more performance metrics associated with the connection (e.g., RSSI, packet loss rate, etc.) fall below a threshold (e.g., remain below the threshold for a minimum period of time)). In the away mode 1508, the playback device may be expected to be in an environment where a WLAN is not available (e.g., on a commute to/from work, outside walking on a street, on a hike, etc.) and there may (or may not) be a connection over a PAN to a user device. Given the loss of the connection to the WLAN, the playback device may support fewer functions (and/or different functions) in away mode 1508 than are supported in home mode 1504. For example, in the away mode 1508, the playback device may support one or more of the following operations: (1) playback of audio streamed from a user device over a PAN (e.g., when connected to a PAN); and/or (2) call handling (e.g., playing back audio from the call and outputting detected voice utterances on the call) for calls received over a PAN via a user device (e.g., when connected to a PAN). Further, in the away mode 1508, the playback device may not support one or more of the following operations: (1) playback of audio streamed from an audio service provider over a WLAN; (2) playback of audio from another device (e.g., another playback device, such as a soundbar) on a WLAN; and/or (3) voice commands associated with a VAS provider where the captured voice utterance is sent to a remote server over a WLAN.

Given the difference in supported functions between the home mode 1504 and the away mode 1508, one or more components of the playback device (e.g., at least one of the application processor(s) in the playback device) may be put into a low power state to reduce the power consumption of the playback device. For example, the playback device may be in the power state P4 when in away mode 1508. The power consumption of the playback device may be at a level that is lower than when operating in home mode 1504. Upon detection of a trigger event, such as a WLAN connection becoming available (e.g., the presence of a WLAN connection being detected by the playback device), the playback device may transition from the away mode 1508 to the home mode 1504.

In some instances, the plurality of operating modes may comprise one or more idle modes (and/or idle variations of other modes) where the power consumption of the playback device may be further reduced after some trigger event associated with inactivity. As shown in FIG. 15, the playback device may transition from a home mode 1504 to a home idle mode 1506 after detecting some trigger event associated with inactivity. The event associated with inactivity may include, for example, one or more of the following: (1) detecting a period of no user input for a minimum period of time; (2) detecting that the environment in which the playback device is operating is dark (e.g., the playback device is in a backpack and unlikely to be used); and (3) detecting that the playback device is not positioned for use (e.g., a headphone that is doffed or otherwise not being worn). In the home idle mode 1506, the playback device may consume less power than in the home mode 1504 by putting one or more components in a low power state while still retaining the ability to expeditiously transition back to the home mode 1504 when needed. For example, the playback device may be in one of power states P2-P6 while in home idle mode 1506. Conversely, the playback device may transition from the home idle mode 1506 to the home mode 1504 upon detection of a trigger event associated with activity. The event associated with activity may include, for example, one or more of the following: (1) detecting a user command (e.g., detecting activation of a user interface element, detecting a command from a user device over a PAN and/or a WLAN, etc.); (2) detecting that the environment in which the playback device is operating is bright (e.g., the playback device has been removed from a backpack and is in a well-illuminated room); and (3) detecting that the playback device is positioned for use (e.g., a headphone that is donned or otherwise being worn).

Additionally (or alternatively), the playback device may transition from the away mode 1508 to an away idle mode 1510 after detecting some event associated with inactivity. In the away idle mode 1510, the playback device may consume less power than in the away mode 1508 by putting one or more components in a low power state. For example, the playback device may be in one of power states P5 or P6 while in away idle mode 1510. Conversely, the playback device may transition from the away idle mode 1510 to the away mode 1508 upon detection of some activity (including any of the events associated with activity described herein).

It should be appreciated that the playback device may have more (or fewer) operating modes than are shown in FIG. 15. For example, the set of operating modes may further include a charging mode where the playback device receives power from an external source (e.g., receives power via a cable and/or a wireless charging base) and uses at least some of the power received from the external source to charge an energy storage device (e.g., a battery). The trigger event for entering the charging mode (e.g., from any starting mode) may include, for example, detecting that the playback device is receiving power from an external source. Conversely, the trigger event for exiting the charging mode (e.g., and entering home mode 1504) may include, for example, detecting that the playback device is no longer receiving power from an external source.

In some instances, it may be desirable to prohibit a user from performing one or more operations while the playback device is operating in the charging mode (e.g., is being charged). For example, the playback device may be a headphone and the temperature of the headphone may become too warm while charging to be safely worn by a user. In this example, the playback device may discourage a user from wearing the headphone while in the charging mode by disabling one or more functions of the playback device, such as one or more of the following functions: (1) playback of audio streamed from an audio service provider over a WLAN (e.g., when connected to a WLAN); (2) playback of audio from another device (e.g., another playback device, such as a soundbar) on a WLAN (e.g., when connected to a WLAN); (3) playback of audio streamed from a user device over a PAN (e.g., when connected to a PAN); (4) voice commands associated with a VAS provider where the captured voice utterance is sent to a remote server over a WLAN (e.g., when connected to a WLAN); and/or (5) call handling (e.g., playing back audio from the call and outputting detected voice utterances on the call) for calls received over a PAN via a user device (e.g., when connected to a PAN). While the playback device is in the charging mode, the playback device may put most (if not all) of its components in a high power state (e.g., an awake state). For example, the playback device may be in power state P1 while operating in the charging mode. By putting most (if not all) of the components of the playback device in a high power state while in the charging mode (e.g., instead of turning off most of the components of the playback device), the playback device may still perform one or more background operations (e.g., including one or more power-intensive background operations that may be undesirable to perform while not receiving power). For example, the playback device may perform one or more background operations to maintain a connection to one or more servers associated with one or more service providers (e.g., a voice assistant service provider and/or a streaming service provider) while in the charging mode. By maintaining the connection to the one or more servers associated with the one or more service providers while in the charging mode, the playback device may advantageously be able to transition from the charging mode to another mode, such as the home mode 1504, faster than had the connection been stopped and needed to be reestablished.

In some implementations, the triggering events for transitioning between various operating modes (and/or power states) may be tailored to user behavior over time. For example, the playback device may employ one or more sensors (e.g., clocks, infrared sensors, microphones, wireless radios, etc.) that may be employed to monitor the state of the playback device (e.g., last used 2 hours ago at 10:00 pm) and/or the environment in which the playback device is located (e.g., in a dark area like a backpack). In this example, the playback device may identify patterns of user behavior based on such information and use those identified patterns to intelligently modify the trigger events to match the identified patterns of user behavior. For example, the playback device may identify that the playback device is infrequently used after 10:00 pm in the evening when in home mode 1504 and modify the trigger event associated with a transition from home mode 1504 to a home idle mode 1506 such that the playback device automatically transitions from the home mode 1504 to the home idle mode 1506 at 10:00 pm (if the playback device is not already in the home idle mode 1506).

It should be appreciated that the playback device may employ any of a variety of signal processing techniques to adjust the trigger events based on monitored user behavior. In some instances, the playback device may employ one or more filtering techniques (e.g., employing a digital moving average filter) to modify the trigger events. For example, the playback device may monitor the time at which user inputs (e.g., indicating that the user is requesting the playback device to perform an operation) are detected throughout the day and identify the last interaction of the day (e.g., to identify a time after which the user has likely started to sleep and is unlikely to use the playback device until the next morning). In this example, the playback device may compute an average (e.g., a moving average) of the times associated with the last user input of the day and, in turn, use that value to modify a trigger associated with transitioning from home mode 1504 to home idle mode 1506. In other instances, the playback device may employ other signal processing techniques, such as machine learning techniques, to adjust the triggering events based on monitored user behavior.
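
A minimal C sketch of the digital moving average filter mentioned above is shown below, applied to the time of day (in minutes since midnight) of each day's last user interaction. The window length and units are assumptions for illustration.

    #include <stdio.h>

    #define WINDOW 7 /* average over the last week of observations (an assumption) */

    /* Simple digital moving average over the minutes-since-midnight of the
     * last user interaction observed on each day. */
    typedef struct {
        int samples[WINDOW];
        int count; /* number of valid samples (saturates at WINDOW) */
        int head;  /* next slot to overwrite                        */
    } moving_avg_t;

    static void ma_add(moving_avg_t *ma, int minutes)
    {
        ma->samples[ma->head] = minutes;
        ma->head = (ma->head + 1) % WINDOW;
        if (ma->count < WINDOW)
            ma->count++;
    }

    static int ma_value(const moving_avg_t *ma)
    {
        int sum = 0;
        for (int i = 0; i < ma->count; i++)
            sum += ma->samples[i];
        return ma->count ? sum / ma->count : 0;
    }

    int main(void)
    {
        moving_avg_t ma = {0};
        /* Last interactions observed at 21:40, 22:05, and 21:55. */
        ma_add(&ma, 21 * 60 + 40);
        ma_add(&ma, 22 * 60 + 5);
        ma_add(&ma, 21 * 60 + 55);
        int t = ma_value(&ma);
        printf("idle trigger at %02d:%02d\n", t / 60, t % 60);
        return 0;
    }

The filtered value could then stand in for the fixed 10:00 pm transition time in the example above, so the idle trigger tracks the user's habits rather than a hard-coded clock time.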

FIG. 8 illustrates examples of operations performed by a playback device implementing the components of device 500 (shown in FIG. 5) during initial power-up or activation (e.g., a transition from off mode 1502 to the home mode 1504 in FIG. 15). In an example, after initialization, the first processor(s) 524 and the second processor(s) 526 are in the awake state and the playback device is in the P1 state (and/or home mode 1504) noted above.

At block 800, the playback device is powered on. That is, power may be applied to the components illustrated in FIG. 5. The playback device may be powered on in response to a user interaction with the control area (e.g., control area 258), such as a button press (including a long button press). In another example, removal of the playback device from a charging dock (e.g., charging dock 246) causes the playback device to power on. In yet another example, the playback device may include a motion sensor and the detection of motion by the motion sensor causes the playback device to power on.

At block 805, after power is applied to the components illustrated in FIG. 5, the second processor(s) 526 may load initialization code (i.e., boot code) from non-volatile memory of the data storage 506. An example of initialization involves setting registers of various components of the second processor(s) 526. Another example further involves initializing network code that facilitates network communication via the first communication link 512. For example, an 802.11 stack may be initialized.

At block 810, in some examples, the second processor(s) 526 communicates initialization code (i.e., boot code) to the first processor(s) 524. For example, the boot code loaded from the non-volatile memory of the data storage 506 may include a code segment that corresponds to the boot code for the first processor(s) 524. This code segment can be communicated to the first processor(s) 524 via an interface of the second processor(s) 526, such as a SPI bus.

In another example, the first processor(s) 524 loads the boot code directly from the non-volatile memory of the data storage 506 or from a different non-volatile memory dedicated to the first processor(s) 524.
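
For illustration, the C sketch below shows one way the boot-code transfer of block 810 might be chunked over a SPI bus. Here hal_spi_write is a hypothetical stand-in for the device's actual bus driver, and the chunk size is an assumption.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Stand-in for a HAL call that shifts bytes out on the SPI bus linking
     * the second processor(s) to the first processor(s); hypothetical. */
    static int hal_spi_write(const uint8_t *buf, size_t len)
    {
        (void)buf;
        printf("spi: wrote %zu bytes\n", len);
        return 0;
    }

    /* Stream the boot-code segment for the first processor(s) out of the
     * boot image in fixed-size chunks, as block 810 describes. */
    static int forward_boot_code(const uint8_t *image, size_t off, size_t len)
    {
        const size_t CHUNK = 256; /* chunk size is an assumption */
        for (size_t sent = 0; sent < len;) {
            size_t n = (len - sent < CHUNK) ? len - sent : CHUNK;
            if (hal_spi_write(image + off + sent, n) != 0)
                return -1; /* bus error: abort and let the caller retry */
            sent += n;
        }
        return 0;
    }

    int main(void)
    {
        uint8_t image[1000];
        memset(image, 0xA5, sizeof image); /* placeholder boot image */
        return forward_boot_code(image, 0, sizeof image);
    }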

At block 815, the first processor(s) 524 initializes. An example of initialization involves setting registers of various components of the first processor(s) 524. Another example further involves initializing network code that facilitates network communication via the second communication link. For example, a BLUETOOTH stack may be initialized.

After initialization of the second processor(s) 526 and the first processor(s) 524, the playback device may be in the P1 state (e.g., and/or home mode 1504). As noted above, in the P1 state, the power consumed by the playback device may be relatively high.

FIG. 9 illustrates examples of operations performed by a playback device (implementing the components of device 500) after the first processor(s) 524 and the second processor(s) 526 have been initialized and the playback device is in, for example, the P1 state (e.g., and/or home mode 1504). At block 900, the playback device receives audio content from a second remote device (e.g., second computing device 516) via the second communication link 518. The audio content may be played back via speakers (e.g., speakers 218) of the playback device. For example, the second communication link 518 can correspond to a BLUETOOTH link and a BLUETOOTH connection can be established between the playback device and, for example, a user device (e.g., a smartphone, a tablet computer, a laptop, a desktop, etc.). The user device may stream audio content, via the BLUETOOTH connection, to the playback device. The BLUETOOTH processing performed by the playback device may be primarily (and/or entirely) implemented by the first processor(s) 524. In an example, the second processor(s) 526 is not involved in the processing of the audio content received from the user device.

At block 905, an indication may be received as to whether a first communication link 512 can be established between the playback device 102 and a first remote device (e.g., first computing device 510). For example, the first processor(s) 524 may receive an indication from the first network circuit 520 of the network interface component(s) 502 as to whether the first communication link 512 can be established. In one example, the first communication link 512 corresponds to an Institute of Electrical and Electronics Engineers (IEEE) 802.11 based communication link. In an example, the indication that the first communication link 512 can be established is communicated to the first processor(s) 524 when an 802.11 signal is detected by the first network circuit 520 of the network interface component(s) 502 (e.g., the power associated with the signal is above a pre-determined threshold such as −30 dBm). In another example, the indication is communicated to the first processor(s) 524 when a service set identifier (SSID) associated with the first communication link 512 matches a predetermined SSID. For example, the predetermined SSID may correspond to the SSID of the home network of the playback device user. When the playback device 102 is within range of the home network, the indication that the first communication link 512 can be established is communicated from the first network circuit 520 of the network interface component(s) 502. In another example, the indication that the first communication link 512 can be established is communicated to the second processor(s) 526 (e.g., via an interrupt signal). Additional example techniques by which a network circuit can be configured to output a signal when a particular SSID is identified are described in PCT Publication No. WO/2020/150595, published on Jul. 23, 2020, titled “Power Management Techniques for Waking-Up Processors in Media Playback Systems,” which is incorporated herein by reference in its entirety.
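
The following C sketch illustrates, under stated assumptions, the kind of test that might gate the indication at block 905: signal that the first communication link 512 can be established when the 802.11 signal power exceeds a threshold or when the detected SSID matches a predetermined SSID. Combining the two checks in one function is a simplification for this example; the disclosure presents them as alternatives.

    #include <stdio.h>
    #include <string.h>

    #define RSSI_THRESHOLD_DBM (-30) /* example threshold from the text */

    /* Returns 1 when an indication that the link can be established
     * should be signaled to the waiting processor. */
    static int should_signal_link_available(int rssi_dbm, const char *ssid,
                                            const char *home_ssid)
    {
        if (rssi_dbm > RSSI_THRESHOLD_DBM)
            return 1;                         /* strong 802.11 signal detected */
        return strcmp(ssid, home_ssid) == 0;  /* known network in range        */
    }

    int main(void)
    {
        printf("%d\n",
               should_signal_link_available(-45, "MyHomeWifi", "MyHomeWifi"));
        return 0;
    }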

At block 910, if the first communication link 512 can be established, then at block 925, the second processor(s) 526 is either maintained in the awake state or transitioned to the awake state to facilitate receiving audio content or other information via the first communication link 512 (e.g., stay in home mode 1504, or transition to home mode 1504 if not in home mode 1504, such as transitioning from away mode 1508 to home mode 1504). For example, if the current power state of the playback device is P1, the second processor(s) 526 is already awake (e.g., clock(s) of the second processor(s) 526 are running). In this case, the second processor(s) 526 can proceed to establish a communication link with one or more remote devices, such as a controller device 104, an audio streaming service provider, and/or a different playback device 102 from which audio content can be streamed.

If the current power state of the playback device corresponds to the P2 state (e.g., second processor(s) 526 are in a light sleep state) or higher (P3-P6), an indication may be communicated from the first processor(s) 524 to the second processor(s) 526 to cause the second processor(s) 526 to transition to the awake state, or as indicated above, the indication can be communicated directly to the second processor(s) 526. For example, an interrupt to the second processor(s) 526 may be generated by the first processor(s) 524 or received directly from the network interface component(s) 502. In response, the clock(s) of the second processor(s) 526 may be activated or the respective frequencies of the clocks may be shifted to a nominal operating frequency that facilitates streaming of audio content by the second processor(s) 526.

If at block 910, the first communication link 512 cannot be established, then at block 920, the second processor(s) 526 is controlled to transition to a lower power state, such as the light sleep state and/or the deep sleep state (e.g., transition from home mode 1504 to a lower power mode such as an away mode 1508). That is, the playback device 102 may transition to, for example, the P4 state. As shown in block 915, in some examples, the second processor(s) 526 is controlled to transition to a lower power state only after a predetermined amount of time has elapsed (e.g., 10 seconds). This can prevent unnecessarily bringing down the stack associated with the first communication link 512 when there is a momentary loss of signal, which can occur when, for example, the user of the playback device 102 moves momentarily beyond range of a WIFI router in the home. In some examples, after a second predetermined amount of time has elapsed (e.g., 1 minute), the second processor(s) 526 is controlled to transition to a still lower power state (e.g., from a light sleep state to a deep sleep state, or from a first deep sleep state to a second deep sleep state with a lower power consumption than the first deep sleep state) to save additional power.
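
One way to picture the two-stage timeout of blocks 915 and 920 is the C sketch below, which holds off the light sleep transition for a first interval and the deep sleep transition for a second, longer interval. The constants mirror the 10-second and 1-minute examples above but are otherwise assumptions.

    #include <stdint.h>
    #include <stdio.h>

    enum { HOLDOFF_MS = 10000, DEEPEN_MS = 60000 };

    typedef enum { AWAKE, LIGHT_SLEEP, DEEP_SLEEP } sleep_state_t;

    /* Decide the second processor's sleep state from how long the first
     * communication link has been down; momentary losses shorter than
     * HOLDOFF_MS never bring the 802.11 stack down. */
    static sleep_state_t on_tick(int link_up, uint32_t down_ms)
    {
        if (link_up)
            return AWAKE;
        if (down_ms >= HOLDOFF_MS + DEEPEN_MS)
            return DEEP_SLEEP;
        if (down_ms >= HOLDOFF_MS)
            return LIGHT_SLEEP;
        return AWAKE; /* still within the holdoff window */
    }

    int main(void)
    {
        printf("%d %d %d\n",
               on_tick(0, 5000),   /* 0: still awake  */
               on_tick(0, 15000),  /* 1: light sleep  */
               on_tick(0, 80000)); /* 2: deep sleep   */
        return 0;
    }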

FIG. 10 illustrates examples of operations performed by a playback device (implementing the components of device 500) when the playback device is in the P1 state (e.g., a home mode 1504).

At block 1000, the second processor(s) 526 and the first processor(s) 524 are in an awake state (e.g., in home mode 1504) and can process, for example, audio content or information received via the first communication link 512 and the second communication link 518, respectively. In some examples, the second processor(s) 526 can process other information communicated via the first communication link 512, such as receiving commands from and reporting information to a controller.

In an example, the first processor(s) 524 forwards information associated with the audio content received via the second communication link 518 to the second processor(s) 526. The second processor(s) 526 then forwards the information via the first communication link 512 to, for example, a controller. For example, information associated with content being played, such as the name and artist associated with a song being played, can be forwarded via the first communication link 512 to the controller. In some examples, an indication of the length of the song and the position within the song associated with the portion of the song being played can be communicated.

In some examples, information received via the first communication link 512 can be used to control audio content being communicated via the second communication link 518. For example, a pause indication, next song indication, previous song indication, etc., can be received from a controller via the first communication link 512, forwarded by the second processor(s) 526 to the first processor(s) 524 (e.g., via the SPI bus), and then forwarded to a remote device (e.g., a mobile phone). The remote device can then process the indication communicated from the controller accordingly.

If at block 1005, an amount of inactivity time of the first communication link 512 exceeds a threshold, then at block 1010, the first processor(s) 524 and/or the second processor(s) 526 can be controlled to transition to a sleep state (e.g., a home idle mode 1506). That is, the playback device transitions to, for example, any of states P2-P6. For example, if no audio content or user-initiated commands are communicated via the first communication link 512 for a predetermined amount of time (e.g., 10 seconds), the second processor(s) 526 can be controlled to transition to the light sleep state. In some examples, after a second predetermined amount of time has elapsed (e.g., 1 minute), the second processor(s) 526 is controlled to transition to the deep sleep state to save additional power.

In some examples, a timer process operating on the first processor(s) 524 controls the second processor(s) 526 to transition to the light and/or deep sleep states. In other examples, a timer process operates on the first processor(s) 524 and, upon expiry, causes the first processor(s) 524 to send a command (e.g., via a SPI bus) to control the second processor(s) 526 to transition to the light and/or deep sleep states.

In some examples, if an amount of inactivity time of the second communication link 518 exceeds a threshold, the first processor(s) 524 and/or the second processor(s) 526 can be controlled to transition to a sleep state (e.g., any of states P2-P6). For example, if no audio content or user-initiated commands are communicated via the second communication link 518 for a predetermined amount of time (e.g., 10 seconds), the first processor(s) 524 can be controlled to transition to the light sleep state. In some examples, after a second predetermined amount of time has elapsed (e.g., 1 minute), the first processor(s) 524 is controlled to transition to the deep sleep state to save additional power. That is, depending on whether the second processor(s) 526 is in the light or deep sleep state, the playback device 102 transitions to either the P5 or P6 state. In some examples, a timer process operating on the second processor(s) 526 controls the first processor(s) 524 to transition to the light and/or deep sleep state.

If at block 1015, a resume indication is received, then at block 1020, the second processor(s) 526 is controlled to transition to an awake state (e.g., transition from a home idle mode 1506 to home mode 1504) to, for example, facilitate communicating information via the first communication link 512. In some examples, both the second processor(s) 526 and the first processor(s) 524 transition to the awake state. That is, the playback device 102 transitions to the P1 state.

In one example, a resume indication occurs when the user of the playback device initiates an action via the control area (e.g., control area 258), such as an action to immediately establish communications with a streaming service, controller, and/or another playback device. In another example, a resume indication occurs when the user removes the playback device from a charging dock (e.g., charging dock 246). In another example, a resume indication can occur at a particular interval (e.g., every 5 minutes).

While several techniques have been described above for managing power consumption, other techniques are contemplated. For example, user interaction with the playback device can be monitored to determine typical periods of activity and inactivity associated with the playback device. Instruction code operating on the first processor(s) 524 and/or the second processor(s) 526 can implement machine learning logic that can be trained over time to predict these periods of activity and inactivity. During predicted periods of activity, the machine learning logic can, for example, transition the second processor(s) 526 to the awake state to facilitate communicating information to the playback device via the first communication link 512. During predicted periods of inactivity, the machine learning logic can, for example, transition the second processor(s) 526 to the light sleep state to conserve power.

V. Example Distributed Audio Processing Techniques

FIG. 11 illustrates an example of a distributed audio processing environment 1100 in which processing operations associated with the playback of audio content are distributed among multiple processors (e.g., within a single playback device and/or between multiple playback devices). In this case, the processing operations are distributed between multiple processors shown as processor(s) 1101A and processor(s) 1101B, which may have different constructions. In some embodiments, the processor(s) 1101A may be implemented as one or more GPPs that support memory virtualization (e.g., as one or more application processors) while the processor(s) 1101B may be implemented as one or more GPPs and/or one or more SPPs that do not support memory virtualization. For example, with respect to FIG. 5, the processor(s) 1101A may correspond to the second processor(s) 526 while the processor(s) 1101B may correspond to the first processor(s) 524. With respect to FIG. 15, the distributed audio processing techniques may be employed by a playback device operating in, for example, home mode 1504 to support one or more functions that involve audio playback (e.g., playback of audio streamed from an audio service provider over a WLAN; playback of audio from another device, such as another playback device, on a WLAN; playback of a response to voice commands associated with a VAS provider, etc.).

In some embodiments, the processing operations may be distributed between a first playback device that comprises the processor(s) 1101A and a second playback device that comprises the processor(s) 1101B. In these embodiments, the first and second playback devices may have different form factors. For example, the first playback device may be implemented as a stationary playback device while the second playback device may be implemented as a wearable device such as a headphone device (including in-ear, on-ear, and/or over-ear headphones) and/or a portable playback device.

As described in further detail below, the processor(s) 1101A are configured to receive audio information 1105 via a first communication link (e.g., first communication link 512). An example of the audio information 1105 includes audio content provided from a variety of content source providers (e.g., music streaming service providers, voice-assistant service providers, etc.). The audio information 1105 may (or may not) contain additional information associated with the audio content, such as a playback time to facilitate synchronous playback with at least one other device. The processor(s) 1101A generate audio processing information for the audio content in the audio information and communicate the audio content and the audio processing information to the processor(s) 1101B. In turn, the processor(s) 1101B perform some or all of the audio processing operations and cause the audio content to be played back using the audio processing information. The audio processing information may specify one or more parameters regarding how the audio content should be played back. For example, the audio processing information may comprise normalization information such as a volume at which the audio content should be played. Other examples of the audio processing information include a presentation time that may, for example, be used as a basis for determining an amount of delay to apply to the audio content to facilitate playing the audio content in synchrony with other playback devices (e.g., playing back the audio content in lip-synchrony with visual content played back by a television (or other video playback device) and/or playing back the audio content in synchrony with playback of audio content by other playback devices). Yet other examples of the audio processing information include a codec used by the content source provider to encode audio samples of the audio content.

As noted above, audio information 1105 may comprise audio content received from different audio content source providers. For example, first audio content can correspond to audio streamed from an audio streaming service (e.g., SPOTIFY). Second audio content can correspond to audio from an online assistant such as ALEXA by AMAZON. In an example, the audio information 1105 (or any portion thereof) is communicated directly to the processor(s) 1101A from the various audio content source providers via the first communication link. In other examples, the audio information 1105 (or any portion thereof) is communicated to the processor(s) 1101A from another playback device (e.g., indirectly communicated from the service providers via one or more other playback devices). In another example, a first portion of the audio information 1105 (e.g., first audio content) is communicated directly to the playback device 102 from an audio content source provider and a second portion of the audio information 1105 (e.g., second audio content) is communicated from a different playback device. For example, first audio content associated with an online assistant can be communicated directly to the processor(s) 1101A and second audio content associated with an audio streaming service can be communicated from a different playback device to the processor(s) 1101A.

Some of the audio processing information may be associated with information specified by a user via a user device (e.g., control device 104). For example, a user may, via the user device, select a particular audio streaming service from which audio content should be streamed. In some cases, the user may specify, via the user device, the volume at which the audio content should be played. In some examples, the user can specify equalization information to be applied to the audio content, such as the bass and treble levels to be applied to the audio content.

As noted above, in some examples, some or all of the audio processing operations are performed by the processor(s) 1101B. To facilitate performance of the audio processing operations, the processor(s) 1101A communicates the audio content and corresponding audio processing information to the processor(s) 1101B. The corresponding audio processing information may be communicated to the processor(s) 1101B in the form of metadata. The metadata associated with a given portion of audio content may specify the audio processing information associated with that given portion of the audio content. For example, as shown in FIG. 11, the metadata can specify a content source identifier (ID) that facilitates determining the source associated with the audio content. An example of the metadata specifies normalization information such as the volume at which the audio content should be played. An example of the metadata specifies a presentation time that facilitates delaying the audio content to support playback of the audio content in synchrony with other playback devices. An example of the metadata specifies a codec that facilitates selecting a codec for decoding the audio content for playback. Other information may be specified in the metadata.

In an example, the audio content and the corresponding metadata is communicated as a data stream 1110. The data stream 1110 may include packets 1112, and each packet 1112 can include header data and payload data. The header data of the packet 1112 may specify the metadata associated with particular audio content. For example, the header data of a first packet (e.g., Header 1) may specify first metadata 1115A associated with first audio content provided by a first audio content source provider, and the payload data (e.g., Payload 1) of the first packet may carry audio content portion 1120A associated with the first audio content. The header data of a second packet (e.g., Header 2) may specify second metadata 1115B associated with second audio content provided by a second audio content source provider, and the payload data (e.g., Payload 2) of the second packet may carry audio content portion 1120B associated with the second audio content.
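
For illustration only, the C sketch below shows one plausible layout for a packet 1112 that carries metadata in its header and audio samples in its payload. The field names, widths, and payload capacity are assumptions; the disclosure does not prescribe a particular wire format.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical header carrying the metadata described for packets 1112. */
    typedef struct {
        uint8_t  content_source_id; /* which provider the payload belongs to */
        uint8_t  codec;             /* codec used to encode the samples      */
        uint8_t  channel;           /* e.g., left/right/center/subwoofer     */
        uint8_t  sample_count;      /* samples carried in the payload        */
        uint16_t volume_q8;         /* normalization info, Q8 fixed point    */
        uint64_t presentation_time; /* when to render, for synchrony         */
    } packet_header_t;

    typedef struct {
        packet_header_t header;
        int16_t         payload[16]; /* PCM samples (capacity is an assumption) */
    } packet_t;

    int main(void)
    {
        packet_t p = {
            .header = { .content_source_id = 1, .codec = 0, .channel = 0,
                        .sample_count = 2, .volume_q8 = 256,
                        .presentation_time = 123456789u },
            .payload = { 100, -100 },
        };
        printf("source %u carries %u sample(s)\n",
               (unsigned)p.header.content_source_id,
               (unsigned)p.header.sample_count);
        return 0;
    }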

In some examples, the data stream 1110 is bi-directional. For example, the packets 1112 can be communicated from the processor(s) 1101A to the processor(s) 1101B to facilitate playback of audio content received by the processor(s) 1101A. Information may also be communicated from the processor(s) 1101B to the processor(s) 1101A. For example, a microphone may be communicatively coupled to the processor(s) 1101B that is configured to detect sound including voice utterances by a user associated with a voice command, such as a voice command to be directed to an online assistant (e.g., “Hey Alexa”). The voice utterance can be processed by the processor(s) 1101B and communicated to the processor(s) 1101A via any of the techniques described herein. The processor(s) 1101A can forward the voice utterance to an appropriate source provider for processing (e.g., Amazon's Alexa®).

In some examples, the metadata specifies the number of audio content samples specified within the payload data. For example, the metadata can specify that 1, 2, 4, 8, 16, etc., samples are specified in the payload data. In some examples, the metadata specifies a channel associated with the audio samples (e.g., left, right, front, back, center, subwoofer, etc.).

In some examples, the metadata specifies channel information. The channel information specified for a particular packet can specify that the samples within the payload data correspond to particular audio channels and that the samples should be output in synchrony to specific amplifiers for playback. For example, the payload of a particular packet can include two samples, one for a left channel and one for a right channel. The channel information can specify that the first sample corresponds to the left channel and the second sample corresponds to the right channel. The same technique can be adapted to work with a different number of channels.

In some examples, the metadata specifies target information. The target information can specify a target device(s) or component to which processed audio content or samples should be communicated. For example, the target information can specify that the audio content should be communicated to another device via a second communication link (e.g., communication link 518). The target information can specify that the processed audio content should be communicated via a BLUETOOTH link to a BLUETOOTH speaker. The target information can specify that the processed audio content should be communicated via an 802.11 based link to a different playback device. For example, the processor(s) 1101B can communicate the processed audio content via the first communication link to a different playback device.

It should be understood that the metadata specified within successive packets can be different. This, in turn, facilitates multiplexing audio content communicated from different audio content source providers and communicating the multiplexed audio content to different amplifiers and/or devices. For example, first audio content can be communicated to an amplifier of the playback device and second audio content can be communicated via a BLUETOOTH link to a BLUETOOTH speaker. The content source ID specified in the metadata facilitates assembling related audio content specified in the payload data from disparate/non-sequential packets.

FIG. 12 illustrates a logical representation of example processing operations performed by the processor(s) 1101A and the processor(s) 1101B to facilitate distributed audio processing. In particular, processor logic 1210 may be implemented using, at least in part, the processor(s) 1101A and the processor logic 1215 may be implemented using, at least in part, the processor(s) 1101B. As shown, audio information 1105 is communicated to processor logic 1210. In an example, the audio information 1105 includes audio content associated with N content source providers, which may be received from a common source or different sources. In an example, the processor logic 1210 represents/maintains the audio content from these content source providers as separate content streams (e.g., Content Stream 1-Content Stream N). Each content stream is partitioned into content portions and metadata may be generated for at least some of the content portions. In some instances, the content portions may correspond to, for example, a particular number of audio content samples (e.g., 1, 2, 4, 8, etc.). The processor logic 1210 includes content transmitter logic 1220 that is configured to associate content portions to be communicated with corresponding metadata, and to output, for example, the data stream 1110 of packets 1112 illustrated in FIG. 11.

In some examples, the processor logic 1210 may (e.g., for encoded audio content) compare the codec needed to decode the audio content with a set of one or more codecs (e.g., in a table stored in memory) that the processor logic 1215 supports. In these examples, the processor logic 1210 may leave the audio content in an encoded state (e.g., for transmission in the data stream 1110) when the codec needed to decode the audio content matches a codec in the set of one or more codecs supported by the processor logic 1215. Otherwise, the processor logic 1210 may, for example, decode the audio content (e.g., to an uncompressed format such as pulse-code modulation (PCM) format) prior to transmission in the data stream 1110. Alternatively, the processor logic 1210 may re-encode the audio content in a different format that is supported by the processor logic 1215. As a result, the processor logic 1215 may be capable of rendering audio from a wider range of sources than would otherwise be possible.
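
The pass-through decision described in this paragraph might be sketched in C as follows; the codec names and the supported-codec table are assumptions for illustration.

    #include <stdio.h>
    #include <string.h>

    /* Codecs the downstream processor logic 1215 is assumed to support. */
    static const char *downstream_codecs[] = { "sbc", "aac" };

    static int downstream_supports(const char *codec)
    {
        size_t n = sizeof downstream_codecs / sizeof *downstream_codecs;
        for (size_t i = 0; i < n; i++)
            if (strcmp(downstream_codecs[i], codec) == 0)
                return 1;
        return 0;
    }

    int main(void)
    {
        const char *needed = "flac";
        if (downstream_supports(needed))
            printf("pass through %s unchanged\n", needed);
        else
            printf("decode %s to PCM before streaming\n", needed);
        return 0;
    }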

The processor logic 1215 includes content receiver logic 1225 configured to receive, for example, the data stream 1110 of packets 1112 and to separate audio content of the data stream 1110 according to the metadata into separate content streams (e.g., Recreated Content Stream 1-Recreated Content Stream N). In some examples, the processor logic 1215 includes a buffer and the buffer is filled with audio samples according to a particular clock rate. This clock rate can be adjusted based on a rate at which the audio content is communicated from the processor(s) 1101A. For example, the clock rate can be dynamically adjusted to prevent a buffer underrun or buffer overflow condition from occurring within the content receiver logic 1225.

In another example, the content receiver logic 1225 can be configured to insert/duplicate samples of audio content in the buffer when the rate at which audio samples arrive from the processor(s) 1101A is below a processing rate of the processor(s) 1101B. In addition, the content receiver logic 1225 can be configured to drop or overwrite samples of audio content in the buffer when the rate at which audio samples arrive from the processor(s) 1101A is above a processing rate of the processor(s) 1101B.
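
A simplified C sketch of this rate-matching behavior is shown below: duplicate the most recent sample when the buffer runs low and drop a sample when it runs high. The buffer size and watermarks are assumptions.

    #include <stdio.h>

    #define BUF_LEN   64
    #define LOW_MARK  16
    #define HIGH_MARK 48

    typedef struct {
        short buf[BUF_LEN];
        int   fill; /* samples currently buffered */
    } rx_buffer_t;

    /* Called periodically: nudge the fill level back toward the middle
     * when the producer is running slower or faster than the consumer. */
    static void rate_match(rx_buffer_t *b)
    {
        if (b->fill > 0 && b->fill < LOW_MARK) {
            b->buf[b->fill] = b->buf[b->fill - 1]; /* duplicate newest sample */
            b->fill++;
        } else if (b->fill > HIGH_MARK) {
            b->fill--;                             /* drop the newest sample  */
        }
    }

    int main(void)
    {
        rx_buffer_t b = { .fill = 10 };
        rate_match(&b);
        printf("fill after underrun guard: %d\n", b.fill); /* prints 11 */
        return 0;
    }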

The recreated content streams output from the content receiver logic 1225 are communicated to audio content processing logic 1230 of the processor logic 1215. The audio content processing logic 1230 is configured to process the audio content of a particular recreated content stream according to information specified in the metadata associated with that stream (and/or additional information that may be available locally, such as user preference information (e.g., indicating a preference as to how content streams are mixed, volume preferences, etc.)). For example, the audio content processing logic 1230 may perform one or more of the following operations: (1) mix audio content from multiple recreated streams; (2) decode the content portions (e.g., using a decoder as specified by the metadata); and/or (3) perform asynchronous sample rate conversion so as to synchronize playback with another source (e.g., to play back content at the correct rate).

It should be appreciated that the architecture shown in FIGS. 11 and 12 may be extended from a one-to-one relationship (e.g., between processor(s) 1101A/processor logic 1210 and processor(s) 1101B/processor logic 1215) to other types of relationships including: (1) a one-to-many relationship; (2) a many-to-one relationship; and (3) a many-to-many relationship. For example (e.g., in some one-to-many relationships), multiple battery-powered playback devices may each comprise processor(s) 1101B and perform functions associated with processor logic 1215 based on data streams from a single stationary playback device (e.g., a soundbar) that comprises processor(s) 1101A that perform functions associated with processor logic 1210. In another example (e.g., in some many-to-one relationships), a single battery-powered playback device may comprise processor(s) 1101B and perform functions associated with processor logic 1215 based on multiple data streams from multiple stationary playback devices that each comprise processor(s) 1101A that perform functions associated with processor logic 1210. In this example, a first stationary playback device may generate a first data stream that includes a first content stream and the second stationary playback device may generate a second data stream that includes a second content stream. In yet another example (e.g., in some many-to-many relationships), multiple battery-powered playback devices may each comprise processor(s) 1101B and perform functions associated with processor logic 1215 based on data streams from multiple stationary playback devices that each comprise processor(s) 1101A that perform functions associated with processor logic 1210. In this example, a first stationary playback device may generate a first data stream for all of the battery-powered playback devices that includes a first content stream and the second stationary playback device may generate a second data stream for all of the battery-powered playback devices that includes a second content stream.

In instances where the processor logic 1215 employs metadata and some additional information (e.g., locally available information) to process the content streams, it should be understood that multiple instances of the processor logic 1215 (e.g., in multiple different devices) may process the same content stream differently. For example, a first instance of the processor logic 1215 (e.g., in a first playback device) may take into account first user preference information associated with a first user, and a second instance of the processor logic 1215 (e.g., in a second playback device) may take into account second user preference information associated with a second, different user (where the second user preference information is different from the first user preference information). In this example, the first instance of the processor logic 1215 may render the same content stream differently than the second instance of the processor logic 1215 to take into account the differences in user preference between the respective users.

FIG. 13 illustrates examples of operations performed (e.g., by one or more playback devices) to facilitate distributed audio processing. It should be noted that the operations performed in the figure may be implemented by one or more components of one or more playback devices, such as the first processor(s) 524 and the second processor(s) 526 described above with respect to FIG. 5. In this regard, instruction code can be stored in a memory of the one or more playback devices and executed by the first processor(s) 524 and/or the second processor(s) 526 to cause the first processor(s) 524 and the second processor(s) 526, alone or in combination with other components of the one or more playback devices, to perform these operations.

At block 1300, audio information 1105 is received. For example, the second processor(s) 526 may be configured to receive audio information 1105 via the first communication link 512. The audio information 1105 can include audio content provided from, for example, one or more content source providers. The audio information 1105 may comprise additional information such as a presentation time to facilitate synchronous playback.

At block 1305, metadata associated with the audio content specified in the audio information 1105 is generated (e.g., using the second processor(s) 526). For example, as noted above, the metadata can specify a content source ID, normalization information, a presentation time, and a codec associated with audio content specified in the audio information 1105. Other examples of the metadata can further specify the number of samples specified in payload data of a data stream, channel information associated with the audio samples, and target information that specifies a target device or component to which processed audio content or samples should be communicated.

In instances where the metadata comprises a presentation time, it should be appreciated that the presentation time may be generated in any of a variety of ways. In some embodiments, the audio information received at block 1300 may comprise a first presentation time. In these embodiments, a second presentation time that is incorporated into the metadata may be derived from the presentation time received in the audio information. For example, the second presentation time may be identical to the first presentation time. In other examples, the second presentation time may be generated based on the first presentation time and clock timing information (e.g., a difference in clock times between a clock on the playback device that is to render the audio content and a reference clock used for synchronization). In other embodiments, the audio information received at block 1300 may not comprise a presentation time. In these embodiments, the presentation time incorporated into the metadata may be generated based on, for example, a local clock time (e.g., associated with a physical clock and/or a virtual clock).
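
As a worked illustration of the second of these examples, the C sketch below derives a second presentation time from a first presentation time plus a clock offset between the rendering device's clock and the reference clock; the microsecond units are an assumption.

    #include <stdint.h>
    #include <stdio.h>

    /* Derive a local-clock presentation time from the received presentation
     * time and the measured offset between the local and reference clocks. */
    static uint64_t derive_presentation_time(uint64_t first_pt_us,
                                             int64_t clock_offset_us)
    {
        return (uint64_t)((int64_t)first_pt_us + clock_offset_us);
    }

    int main(void)
    {
        /* Local clock runs 2500 us ahead of the reference clock. */
        uint64_t pt = derive_presentation_time(1000000u, 2500);
        printf("render at %llu us local time\n", (unsigned long long)pt);
        return 0;
    }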

At block 1310, the audio content and metadata can be packetized into a data stream 1110 (e.g., using second processor(s) 526). Each packet 1112 may include header data and payload data. The header data of the packet 1112 may specify the metadata associated with particular audio content and the payload data of the packet 1112 may specify portions of audio content samples associated with the audio content.

At block 1315, the data stream 1110 is communicated (e.g., to the first processor(s) 524). For example, the second processor(s) 526 can communicate the data stream 1110 to the first processor(s) 524 via a data bus such as a SPI bus, an I2C bus, etc. In another example, the second processor(s) 526 can communicate the data stream 1110 to the first processor(s) 524 by storing the data stream 1110 in a memory location that is accessible to the first processor(s) 524 (e.g., stored in a shared memory). In instances where the second processor(s) 526 and the first processor(s) 524 are housed in different devices (e.g., different playback devices), the data stream 1110 may be communicated via a communication network, such as a WLAN and/or a PAN, from the second processor(s) 526 to the first processor(s) 524.

At block 1320, the audio content specified in the data stream 1110 is processed according to metadata associated with the audio content (e.g., using the first processor(s) 524). For example, audio content processing logic 1230 of the first processor(s) 524 can decode audio content encoded with a particular codec by corresponding codec logic implemented by the audio content processing logic 1230. In an example, the determination of the codec logic to use in decoding the audio content is based on the codec specified in the metadata. In another example, the audio content processing logic 1230 can normalize the audio content according to normalization information specified in the metadata. In another example, the audio content processing logic 1230 can delay the presentation/outputting of the audio content according to a presentation time specified in the metadata. It should be appreciated that the audio content in the data stream 1110 may also be processed based on additional information such as, for example, user preference information (e.g., specifying preferences as to how content is mixed, volume settings, etc.).

At block 1325, the processed audio content is communicated to the amplifiers for playback (e.g., using the first processor(s) 524). Additionally (or alternatively), in some examples, the processed audio content is communicated to a target component and/or device for playback. The target component and/or device can be specified in the metadata associated with the audio content.

FIG. 14 illustrates another example of operations performed by one or more playback devices to facilitate distributed audio processing. It should be noted that the operations performed in the figure may be implemented by one or more components of one or more playback devices, such as the first processor(s) 524 and the second processor(s) 526. In this regard, instruction code can be stored in a memory of the one or more playback devices and executed by the first processor(s) 524 and/or the second processor(s) 526 to cause the first processor(s) 524 and the second processor(s) 526, alone or in combination with other components of the one or more playback devices, to perform these operations.

At block 1400, audio information 1105 is received, and at block 1405, metadata associated with the audio content in the audio information 1105 is generated (e.g., using the second processor(s) 526). Additional details of the operations performed at blocks 1400 and 1405 are described in connection with blocks 1300 and 1305, respectively, of FIG. 13. The additional details are not repeated here for brevity.

At block 1410, the metadata and the audio content are stored (e.g., using the second processor(s) 526) to a data storage (e.g., data storage 506). In one example, the entire duration of the audio content and the metadata associated with the entire duration of the audio content are stored to the data storage. For example, the audio content and metadata associated with an entire music track can be stored to the data storage. In another example, a portion of the audio content (e.g., five seconds) and metadata associated with the portion of the audio content are stored to the data storage.

At block 1415, one or more processor(s) (e.g., the second processor(s) 526) are transitioned to a sleep state. For example, the processor(s) employed to store the metadata and audio content in the data storage (e.g., the second processor(s) 526) may be transitioned to a light sleep state or a deep sleep state after storing the entire length of the audio content or the portion of the audio content. Transitioning the one or more processor(s) to the sleep state facilitates lowering the power consumption of the one or more playback devices.

At block 1420, the metadata and the audio content are read from the data storage (e.g., using the first processor(s) 524).

At block 1425, the audio content is processed according to the metadata (e.g., using the first processor(s) 524). At block 1430, the processed audio content is communicated (e.g., using the first processor(s) 524) to the amplifiers, other components, and/or devices. Additional details of the operations performed at blocks 1425 and 1430 are described in connection with blocks 1320 and 1325, respectively, of FIG. 13.

In examples where only portions of the audio content are stored (e.g., five seconds' worth of audio content), the operations at blocks 1410 through 1430 can be repeated. For example, a one-minute track can be broken into 12 five-second portions. After each five-second portion of audio content is stored, the second processor(s) 526 can be transitioned to a sleep state. The first processor(s) 524 can process the five-second portion of audio content as described above. After processing is finished, or just before, the first processor(s) 524 can send an indication to the second processor(s) 526 to indicate that the first processor(s) 524 are ready to process the next portion of the audio content. The second processor(s) 526 can then transition to the awake state and store the next five-second portion of audio content. This process can repeat until all of the portions of the audio content have been processed. These techniques can be used where data storage is at a premium, that is, where the entire length of the audio content cannot be effectively stored in the data storage. These techniques can also be used to minimize the delay between when the audio content is received by the one or more playback devices and when the processed audio content is output from the one or more playback devices.
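
The chunked hand-off described above can be pictured as a simple pump loop on the second processor(s): store a portion, sleep, wake when the first processor(s) indicate readiness. The three helper functions below are hypothetical placeholders for platform-specific stream, storage, and sleep/wake primitives.

#include <stdbool.h>
#include <stdint.h>

/* Assumed platform primitives, not implemented here. */
extern bool fetch_next_portion(uint8_t *buf, uint32_t *len);            /* next chunk of the stream */
extern void store_portion_to_storage(const uint8_t *buf, uint32_t len); /* block 1410               */
extern void sleep_until_ready(void);  /* block 1415: light or deep sleep; the first
                                         processor(s)' ready indication wakes the processor */

/* Runs on the second processor(s): store a five-second portion, sleep, and
 * repeat once the first processor(s) are ready, until the track is exhausted. */
void portion_pump(void)
{
    /* ~5 s of 16-bit stereo at 44.1 kHz; static to keep it off the stack */
    static uint8_t buf[5u * 44100u * 2u * 2u];
    uint32_t len;

    while (fetch_next_portion(buf, &len)) {
        store_portion_to_storage(buf, len);
        sleep_until_ready();
    }
}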

VI. Conclusion

The description above discloses, among other things, various example systems, methods, apparatus, and articles of manufacture including, among other components, firmware and/or software executed on hardware. It is understood that such examples are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the firmware, hardware, and/or software aspects or components can be embodied exclusively in hardware, exclusively in software, exclusively in firmware, or in any combination of hardware, software, and/or firmware. Accordingly, the examples provided are not the only way(s) to implement such systems, methods, apparatus, and/or articles of manufacture.

It should be appreciated that references to transmitting information to particular components, devices, and/or systems herein should be understood to include transmitting information (e.g., messages, requests, responses) indirectly or directly to the particular components, devices, and/or systems. Thus, the information being transmitted to the particular components, devices, and/or systems may pass through any number of intermediary components, devices, and/or systems prior to reaching its destination. For example, a control device may transmit information to a playback device by first transmitting the information to a computing system that, in turn, transmits the information to the playback device. Further, modifications may be made to the information by the intermediary components, devices, and/or systems. For example, intermediary components, devices, and/or systems may modify a portion of the information, reformat the information, and/or incorporate additional information.

Similarly, references to receiving information from particular components, devices, and/or systems herein should be understood to include receiving information (e.g., messages, requests, responses) indirectly or directly from the particular components, devices, and/or systems. Thus, the information being received from the particular components, devices, and/or systems may pass through any number of intermediary components, devices, and/or systems prior to being received. For example, a control device may receive information from a playback device indirectly by receiving information from a cloud server that originated from the playback device. Further, modifications may be made to the information by the intermediary components, devices, and/or systems. For example, intermediary components, devices, and/or systems may modify a portion of the information, reformat the information, and/or incorporate additional information.

The specification is presented largely in terms of illustrative environments, systems, procedures, steps, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. These process descriptions and representations are typically used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. Numerous specific details are set forth to provide a thorough understanding of the present disclosure. However, it is understood by those skilled in the art that certain embodiments of the present disclosure can be practiced without certain, specific details. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the embodiments. Accordingly, the scope of the present disclosure is defined by the appended claims rather than the foregoing description of embodiments.

When any of the appended claims are read to cover a purely software and/or firmware implementation, at least one of the elements in at least one example is hereby expressly defined to include a tangible, non-transitory medium such as a memory, DVD, CD, Blu-ray, and so on, storing the software and/or firmware.

VII. Example Features

(Feature 1) A playback device comprising: one or more amplifiers configured to drive one or more speakers; one or more network interface components; a plurality of processing components comprising: one or more first processors configured to execute at least one real-time operating system (RTOS); one or more second processors configured to execute at least one general-purpose operating system (GPOS), wherein the one or more second processors have a different construction than the one or more first processors, wherein the one or more second processors have a plurality of power states including a first power state and a second power state, wherein the one or more second processors consume more power in the second power state than in the first power state; data storage having stored therein instructions executable by the plurality of processing components to cause the playback device to perform a method comprising: causing, using the one or more first processors, the one or more second processors to transition from the first power state to the second power state; after causing the one or more second processors to transition from the first power state to the second power state, obtaining, using the one or more second processors, first audio content from at least one remote server over a wide area network (WAN) via the one or more network interface components; providing, using the one or more second processors, the one or more first processors access to the first audio content; and playing, using the one or more first processors, the first audio content via the one or more amplifiers.

(Feature 2) The playback device of feature 1, wherein the method further comprises: causing, using the one or more first processors, the one or more second processors to transition from the second power state to the first power state; obtaining, using the one or more first processors, second audio content from a computing device over a personal area network (PAN) via the one or more network interface components; and playing, using the one or more first processors, the second audio content via the one or more amplifiers.

(Feature 3) The playback device of any of features 1 and 2, wherein the one or more first processors include a processor not configured to support virtual memory and wherein the one or more second processors include a processor configured to support virtual memory.

(Feature 4) The playback device of any of features 1-3, wherein the one or more first processors include at least one general-purpose processor (GPP) and at least one single-purpose processor (SPP).

(Feature 5) The playback device of feature 4, wherein the at least one GPP comprises a reduced instruction set computer (RISC) processor and the at least one SPP comprises a digital signal processor (DSP).

(Feature 6) The playback device of any of features 1-5, wherein the GPOS is an operating system based on a LINUX kernel.

(Feature 7) The playback device of any of features 1-6, wherein the data storage comprises: a first memory only directly accessible by the one or more first processors; and a second memory only directly accessible by the one or more second processors.

(Feature 8) The playback device of feature 7, wherein providing the one or more first processors access to the first audio content comprises: transmitting the first audio content from the one or more second processors to the one or more first processors via a communication bus.

(Feature 9) The playback device of any of features 1-6, wherein the data storage comprises a shared memory that is directly accessible by the one or more first processors and the one or more second processors.

(Feature 10) The playback device of feature 9, wherein providing the one or more first processors access to the first audio content comprises: storing, using the one or more second processors, the first audio content in the shared memory.

(Feature 11) The playback device of any of features 1-10, further comprising a rechargeable battery.

(Feature 12) The playback device of any of features 1-11, further comprising a housing configured to be worn about a portion of a subject.

(Feature 13) An Internet-of-Things (IoT) device comprising: an electronic component; one or more network interface components; a plurality of processing components comprising: one or more first processors configured to execute at least one real-time operating system (RTOS); one or more second processors configured to execute at least one general-purpose operating system (GPOS), wherein the one or more second processors have a different construction than the one or more first processors, wherein the one or more second processors have a plurality of power states including a first power state and a second power state, wherein the one or more second processors consume more power in the second power state than in the first power state; data storage having stored therein instructions executable by the plurality of processing components to cause the IoT device to perform a method comprising: causing, using the one or more first processors, the one or more second processors to transition from the first power state to the second power state; after causing the one or more second processors to transition from the first power state to the second power state, obtaining, using the one or more second processors, information from at least one server over a wide area network (WAN) via the one or more network interface components; and controlling, using the one or more first processors, operation of the electronic component based on the obtained information.

(Feature 14) The IoT device of feature 13, wherein the electronic component comprises at least one of: a display, an electric motor, a heating element, a switch, a speaker, a light, or a sensor.

(Feature 15) The IoT device of feature 14, wherein the electronic component is the speaker, wherein the information comprises audio content, and wherein controlling operation of the electronic component comprises driving the speaker to reproduce the audio content.

(Feature 16) The IoT device of feature 14, wherein the electronic component is the sensor and wherein controlling operation of the electronic component based on the information comprises reading at least one sensor value from the sensor.

(Feature 17) A module (e.g., a circuit board assembly) for a playback device, the module comprising: at least one circuit board; one or more internal network interface components attached to the at least one circuit board; a plurality of processing components attached to the at least one circuit board, wherein the plurality of processing components comprises: one or more first processors configured to execute at least one real-time operating system (RTOS); one or more second processors configured to execute at least one general-purpose operating system (GPOS), wherein the one or more second processors have a different construction than the one or more first processors, wherein the one or more second processors have a plurality of power states including a first power state and a second power state, wherein the one or more second processors consume more power in the second power state than in the first power state; data storage attached to the at least one circuit board and having stored therein instructions executable by the plurality of processing components to cause the playback device to perform a method comprising: causing, using the one or more first processors, the one or more second processors to transition from the first power state to the second power state; after causing the one or more second processors to transition from the first power state to the second power state, obtaining, using the one or more second processors, first audio content from at least one remote server over a wide area network (WAN) via the one or more network interface components; providing, using the one or more second processors, the one or more first processors access to the first audio content; and playing, using the one or more first processors, the first audio content via one or more amplifiers configured to drive one or more speakers.

(Feature 18) The module of feature 17, wherein the method further comprises: causing, using the one or more first processors, the one or more second processors to transition from the second power state to the first power state; obtaining, using the one or more first processors, second audio content from a computing device over a personal area network (PAN) via the one or more network interface components; and playing, using the one or more first processors, the second audio content via the one or more amplifiers.

(Feature 19) The module of any of features 17 and 18, further comprising the one or more amplifiers and wherein the one or more amplifiers are attached to the at least one circuit board.

(Feature 20) The module of any of features 17-19, further comprising one or more power components attached to the at least one circuit board, wherein the one or more power components are configured to receive power from a power source and distribute power to at least the plurality of processing components.

(Feature 21) The module of any of features 17-20, wherein the one or more first processors are integrated into a first circuit die that is attached to the at least one circuit board and wherein the one or more second processors are integrated into a second circuit die that is separate from the first circuit die and attached to the at least one circuit board.

(Feature 22) The module of any of features 17-20, wherein the one or more first processors and the one or more second processors are integrated into a single circuit die that is attached to the at least one circuit board.

(Feature 23) A headphone device comprising: at least one earpiece; one or more amplifiers configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; at least one processor; and at least one non-transitory computer-readable medium comprising program instructions that are executable by the at least one processor such that the headphone device is configured to: receive, using the one or more network interface components, first audio content and first metadata associated with the first audio content from at least one playback device; receive, using the one or more network interface components, second audio content and second metadata associated with the second audio content from the at least one playback device; based on at least one of the first metadata or the second metadata, generate, using the at least one processor, mixed audio content that comprises at least some of the first audio content and at least some of the second audio content; and play back, using the at least one processor and the one or more amplifiers, the mixed audio content.

(Feature 24) The headphone device of feature 23, wherein the first audio content and the first metadata associated with the first audio content are received from a first playback device and wherein the second audio content and the second metadata associated with the second audio content are received from a second playback device that is different from the first playback device.

(Feature 25) The headphone device of feature 23, wherein the first audio content, the first metadata associated with the first audio content, the second audio content, and the second metadata associated with the second audio content are all received from a single playback device.

(Feature 26) The headphone device of any of features 23-25, wherein the at least one processor does not comprise an application processor (e.g., a processor capable of executing a general-purpose operating system (GPOS) that supports memory virtualization).

(Feature 27) The headphone device of any of features 23-26, wherein the at least one processor comprises a digital signal processor (DSP).

(Feature 28) The headphone device of any of features 23-27, wherein the headphone device comprises at least one microphone and wherein the headphone device is a hearable device configured to play back an amplified version of at least some sound detected using the at least one microphone.

(Feature 29) A playback device comprising: one or more amplifiers configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; at least one processor; and at least one non-transitory computer-readable medium comprising program instructions that are executable by the at least one processor such that the playback device is configured to: receive, using the one or more network interface components, first audio content from at least one cloud server via at least one first network (e.g., a WLAN); generate first metadata associated with the first audio content based on the first audio content; and transmit, using the one or more network interface components, the first audio content and the first metadata associated with the first audio content to at least one first playback device via at least one second network (e.g., a PAN) that is different from the at least one first network.

(Feature 30) The playback device of feature 29, wherein the first audio content and the first metadata associated with the first audio content are transmitted without the playback device playing back the first audio content.

(Feature 31) The playback device of any of features 29-30, wherein the program instructions are executable by the at least one processor such that the playback device is configured to: generate second metadata associated with the first audio content (e.g., that may be different from the first metadata); and transmit the first audio content and the second metadata associated with the first audio content to at least one second playback device via the at least one first network.

(Feature 32) The playback device of feature 31, wherein the at least one first playback device comprises a headphone device and wherein the at least one second playback device comprises a stationary playback device.

(Feature 33) The playback device of any of features 29 and 31-32, wherein the program instructions are executable by the at least one processor such that the playback device is configured to play back the first audio content in synchrony with the at least one first playback device.

(Feature 34) A playback device comprising: one or more amplifiers configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; a plurality of processing components comprising: one or more first processors; and one or more second processors having a different construction than the one or more first processors, wherein the one or more second processors have a plurality of power states including a first power state (e.g., an awake state) and a second power state (e.g., a sleep state such as a light sleep state or a deep sleep state), wherein the one or more second processors consume more power in the first power state than in the second power state; at least one non-transitory computer-readable medium comprising program instructions that are executable by the plurality of processing components such that the playback device is configured to: while the one or more second processors are in the second power state, (i) receive, using the one or more network interface components, audio content from a user device via a first network, (ii) play back, using the one or more first processors and the one or more amplifiers, the audio content, and (iii) detect that a connection to a second network can be established; after detection that the connection to the second network can be established, (i) cause the one or more second processors to transition from the second power state to the first power state, and (ii) establish a connection to the second network; and while the one or more second processors are in the first power state and the playback device is connected to the second network, (i) receive, using the one or more network interface components, second audio content from at least one remote computing device, and (ii) play back, using the one or more second processors and the one or more amplifiers, the second audio content.

(Feature 35) The playback device according to feature 34, wherein the second network corresponds to an Institute of Electrical and Electronics Engineers (IEEE) 802.11-based network, and wherein the program instructions that are executable by the plurality of processing components such that the playback device is configured to detect that a connection to the second network can be established comprise program instructions that are executable by the plurality of processing components such that the playback device is configured to: detect that a service set identifier (SSID) associated with the second network matches a predetermined SSID.

(Feature 36) The playback device according to any one of features 34-35, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: detect that the connection to the second network has been lost; and after detection that the connection to the second network has been lost, cause the one or more second processors to transition from the first power state to the second power state.

(Feature 37) The playback device according to any one of features 34-36, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: before receipt of the audio content from the user device, (i) initialize the one or more second processors and (ii) communicate, using the one or more second processors, initialization instruction code to the one or more first processors to facilitate initialization of the one or more first processors.

(Feature 38) The playback device according to any one of features 34-37, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: cause the one or more second processors to transition to the second power state after a predetermined amount of time of inactivity.

(Feature 39) The playback device according to any one of features 34-38, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: after the one or more second processors transition to the second power state, cause the one or more first processors to transition to a low power state after a predetermined amount of time of inactivity.

(Feature 40) The playback device according to any one of features 34-39, wherein the first network corresponds to a BLUETOOTH network.

(Feature 41) The playback device according to any one of features 34-40, wherein the program instructions that are executable by the plurality of processing components such that the playback device is configured to play back the second audio content comprise program instructions that are executable by the plurality of processing components such that the playback device is configured to: play back the second audio content using the one or more first processors, the one or more second processors, and the one or more amplifiers.

(Feature 42) The playback device according to any one of features 34-41, wherein the playback device includes a rechargeable battery, and wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: while the rechargeable battery is being recharged, cause the one or more second processors to transition to the second power state.

(Feature 43) The playback device according to any one of features 34-42, wherein the playback device comprises a control area configured to receive a user control indication, and wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: after receipt of a user control indication, cause the one or more second processors to transition to the first power state.

(Feature 44) The playback device according to any one of features 34-43, wherein the playback device is a headphone device and further comprises at least one earpiece.

(Feature 45) The playback device according to feature 44, wherein the headphone device comprises at least one microphone and wherein the headphone device is a hearable device configured to play back an amplified version of at least some sound detected using the at least one microphone.

(Feature 46) The playback device according to any one of features 34-45, wherein the playback device is a screenless playback device that does not comprise a display screen.

(Feature 47) The playback device according to any one of features 34-46, wherein the one or more second processors comprise an application processor and/or wherein the one or more first processors do not comprise an application processor.

(Feature 48) A playback device comprising: one or more amplifiers configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; a plurality of processing components comprising: one or more first processors; and one or more second processors having a different construction than the one or more first processors; at least one non-transitory computer-readable medium comprising program instructions that are executable by the plurality of processing components such that the playback device is configured to: receive, using the one or more network interface components, audio information that comprises at least first audio content; generate, using the one or more second processors, at least first metadata associated with the first audio content; communicate, using the one or more second processors, the first audio content and the first metadata to the one or more first processors; and play back, using the one or more first processors and the one or more amplifiers, the first audio content based on the first metadata.

(Feature 49) The playback device according to feature 48, wherein the first metadata specifies one or more of: normalization information or a codec associated with the first audio content.

(Feature 50) The playback device according to feature 49, wherein the first metadata specifies the codec associated with the first audio content, and wherein the program instructions that are executable by the plurality of processing components such that the playback device is configured to play back the first audio content based on the first metadata comprise program instructions that are executable by the plurality of processing components such that the playback device is configured to: decode the first audio content using the codec identified in the first metadata associated with the first audio content.

(Feature 51) The playback device according to any one of features 48-50, wherein the first metadata specifies a presentation time at which the first audio content is to be played back by the playback device to facilitate playing the first audio content in synchrony with other playback devices.

(Feature 52) The playback device according to any one of features 48-51, wherein the program instructions that are executable by the plurality of processing components such that the playback device is configured to communicate the first audio content and the first metadata to the one or more first processors comprise program instructions that are executable by the plurality of processing components such that the playback device is configured to: store, using the one or more second processors, portions of the first audio content and first metadata associated with the portions of the first audio content to data storage of the playback device; and read, using the one or more first processors, the portions of the first audio content and the first metadata associated with the portions of the first audio content from the data storage for playback via the one or more amplifiers.

(Feature 53) The playback device according to any one of features 48-52, wherein the program instructions that are executable by the plurality of processing components such that the playback device is configured to communicate the first audio content and the first metadata to the one or more first processors comprise program instructions that are executable by the plurality of processing components such that the playback device is configured to: store, using the one or more second processors, an entire length of audio content associated with the first audio content and first metadata associated with the entire length of the first audio content to data storage of the playback device; transition the one or more second processors to a low power state; and read (e.g., at least in part while the one or more second processors are in the low power state), using the one or more first processors, the entire length of the first audio content and corresponding first metadata associated with the first audio content from the data storage for playback via the one or more amplifiers.

(Feature 54) The playback device according to any one of features 48-53, wherein the audio information comprises second audio content, and wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: generate, by the one or more second processors, second metadata associated with the second audio content; and combine the first audio content, the first metadata, the second audio content, and the second metadata into a data stream; wherein, in communicating the first audio content and the first metadata to the one or more first processors, the instructions are executable by the plurality of processing components to cause the playback device to: communicate, using the one or more second processors, the data stream to the one or more first processors.

(Feature 55) The playback device according to feature 54, wherein the data stream comprises a plurality of packets of information, wherein header data for each packet specifies one of the first metadata or the second metadata, and payload data of each packet specifies one of a portion of the first audio content or a portion of the second audio content that corresponds to the first metadata or the second metadata specified in the header data.

(Feature 56) The playback device according to feature 55, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: process, by the one or more first processors, audio content specified in the payload of each packet according to metadata specified in the header of the packet.

(Feature 57) The playback device according to feature 54, wherein the second audio content is communicated to the playback device in response to a voice command, and wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: receive, by the one or more first processors and via a microphone associated with the playback device, third audio content associated with the voice command; communicate, by the one or more first processors, the third audio content to the one or more second processors; and communicate, by the one or more second processors, the third audio content to a server for processing a command associated with the third audio content.

(Feature 58) The playback device according to any one of features 48-57, wherein the at least one non-transitory computer-readable medium further comprises program instructions that are executable by the plurality of processing components such that the playback device is configured to: adjust, by the one or more first processors, a clock rate associated with processing of the first audio content based on a rate at which the first audio content is communicated to the one or more first processors by the one or more second processors.

(Feature 59) The playback device according to any one of features 48-58, wherein the playback device is a headphone device and further comprises at least one earpiece.

(Feature 60) The playback device according to feature 59, wherein the headphone device comprises at least one microphone and wherein the headphone device is a hearable device configured to play back an amplified version of at least some sound detected using the at least one microphone.

(Feature 61) The playback device according to any one of features 48-60, wherein the playback device is a screenless playback device that does not comprise a display screen.

(Feature 62) The playback device according to any one of features 48-61, wherein the one or more second processors comprise an application processor and/or wherein the one or more first processors do not comprise an application processor.

(Feature 63) A method performed by a playback device comprising a plurality of processing components including one or more first processors and one or more second processors that have a different construction than the one or more first processors, the method comprising: receiving audio information that comprises at least first audio content; generating, using the one or more second processors, first metadata associated with the first audio content; communicating, using the one or more second processors, the first audio content and the first metadata to the one or more first processors; and playing back, using the one or more first processors, the first audio content based on the first metadata.

(Feature 64) The method according to feature 63, wherein the first metadata specifies one or more of: normalization information or a codec associated with the first audio content.

(Feature 65) The method according to feature 64, wherein the first metadata specifies the codec associated with the first audio content, and wherein playing back the first audio content based on the first metadata comprises: decoding the first audio content using the codec specified in the first metadata.

(Feature 66) The method according to any one of features 63-65, wherein the first metadata specifies a presentation time at which the first audio content is to be played back by the playback device to facilitate playback of the first audio content in synchrony with at least one other playback device.

(Feature 67) The method according to any one of features 63-66, wherein communicating the first audio content and the first metadata to the one or more first processors comprises: storing, using the one or more second processors, portions of the first audio content and first metadata associated with the portions of the first audio content to data storage of the playback device; and reading, using the one or more first processors, the portions of the first audio content and the first metadata associated with the portions of the first audio content from the data storage for playback via the one or more amplifiers.

(Feature 68) The method according to any one of features 63-67, wherein communicating the first audio content and the first metadata to the one or more first processors comprises: storing, using the one or more second processors, an entire length of audio content associated with the first audio content and first metadata associated with the entire length of the first audio content to data storage of the playback device; transitioning the one or more second processors to a low power state; and reading, using the one or more first processors, the entire length of the first audio content and corresponding first metadata associated with the first audio content from the data storage for playback via the one or more amplifiers.

(Feature 69) The method according to any one of features 63-68, wherein the audio information comprises second audio content, and wherein the method further comprises: generating, by the one or more second processors, second metadata associated with the second audio content; and combining the first audio content, the first metadata, the second audio content, and the second metadata into a data stream.

(Feature 70) The method according to feature 69, wherein communicating the first audio content and the first metadata to the one or more first processors comprises: communicating, using the one or more second processors, the data stream to the one or more first processors.

(Feature 71) The method according to feature 70, wherein the data stream comprises a plurality of packets of information, wherein header data for each packet specifies one of the first metadata or the second metadata, and payload data of each packet specifies one of a portion of the first audio content or a portion of the second audio content that corresponds to the first metadata or the second metadata specified in the header data.

(Feature 72) The method according to feature 71, further comprising: processing, by the one or more first processors, audio content specified in the payload of each packet according to metadata specified in the header of the packet.

(Feature 73) The method according to feature 69, wherein the second audio content is communicated to the playback device in response to a voice command, and wherein the method further comprises: receiving, by the one or more first processors and via a microphone associated with the playback device, third audio content associated with the voice command; communicating, by the one or more first processors, the third audio content to the one or more second processors; and communicating, by the one or more second processors, the third audio content to a server for processing a command associated with the third audio content.

(Feature 74) The method according to any one of features 63-73, further comprising: adjusting, by the one or more first processors, a clock rate associated with processing of the first audio content based on a rate at which the first audio content is communicated to the one or more first processors by the one or more second processors.

(Feature 75) The method according to any one of features 63-74, wherein the playback device is a headphone device.

(Feature 76) The method according to feature 75, wherein the headphone device is a hearable device and wherein the method further comprises: detecting external sound using at least one microphone; and playing back an amplified version of at least some of the detected external sound.

(Feature 77) The method according to any one of features 63-76, wherein the one or more second processors have a plurality of power states including a first power state and a second power state, wherein the one or more second processors consume more power in the first power state than in the second power state, and wherein the method further comprises: while the one or more second processors are in the second power state, (i) receiving audio content from a user device, (ii) playing back, using the one or more first processors, the audio content, and (iii) detecting that a connection to a wireless local area network (WLAN) can be established.

(Feature 78) The method according to feature 77, further comprising: after detection that the connection to the WLAN can be established, (i) causing the one or more second processors to transition from the second power state to the first power state, and (ii) establishing a connection to the WLAN.

(Feature 79) The method according to feature 78, further comprising: while the one or more second processors are in the first power state and the playback device is connected to the WLAN, (i) receiving second audio content from at least one remote computing device, and (ii) playing back, using the one or more second processors and the one or more amplifiers, the second audio content.

(Feature 80) The method according to feature 79, further comprising: detecting that the connection to the WLAN has been lost; and after detecting that the connection to the WLAN has been lost, causing the one or more second processors to transition from the first power state to the second power state.

(Feature 81) One or more non-transitory computer-readable media comprising program instructions that are executable by a plurality of processing components such that a playback device is configured to perform the method of any of features 63-80.

(Feature 82) A playback device comprising: one or more amplifiers configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; a plurality of processing components comprising: one or more first processors; and one or more second processors having a different construction than the one or more first processors; and at least one non-transitory computer-readable medium according to feature 81.

(Feature 83) The playback device of feature 82, wherein the one or more second processors comprise an application processor and wherein the one or more first processors do not comprise an application processor.

(Feature 84) A circuit board assembly for a playback device, the circuit board assembly comprising: one or more circuit boards; one or more amplifiers attached to the one or more circuit boards, wherein the one or more amplifiers are configured to drive one or more speakers; one or more network interface components configured to facilitate communication over at least one data network; a plurality of processing components attached to the one or more circuit boards, wherein the plurality of processing components comprises: one or more first processors; and one or more second processors having a different construction than the one or more first processors; and at least one non-transitory computer-readable medium according to feature 81.

1-22. (canceled)
23. A media playback system comprising: a playback device comprising a first network interface configured to communicate data over at least one network, one or more first processors, and at least one first non-transitory computer-readable medium comprising first program instructions that are executable by the one or more first processors such that the playback device is configured to receive, using the first network interface, audio information comprising audio content, generate, using the one or more first processors, metadata associated with the audio content, and communicate, using the one or more first processors, the audio content and the metadata to a wearable device; and the wearable device, wherein the wearable device comprises one or more amplifiers configured to drive one or more speakers, a second network interface configured to communicate data over the at least one network, one or more second processors having a different construction than the one or more first processors, and at least one second non-transitory computer-readable medium comprising second program instructions that are executable by the one or more second processors such that the wearable device is configured to receive, using the second network interface, the audio content and the metadata, and play back, using the one or more second processors and the one or more amplifiers, the audio content based on the metadata.
24. The media playback system of claim 23, wherein the metadata comprises one or more of normalization information or a codec associated with the audio content.
25. The media playback system of claim 24, wherein: the metadata comprises the codec; and the second program instructions are executable by the one or more second processors such that the wearable device is configured to decode the audio content using the codec.
26. The media playback system of claim 23, wherein the metadata comprises an indication of a presentation time at which the audio content is to be played back by the wearable device.
27. The media playback system of claim 23, further comprising data storage, wherein: the first program instructions are executable by the one or more first processors such that the playback device is configured to store, using the one or more first processors, portions of the audio content and the metadata to the data storage; and the second program instructions are executable by the one or more second processors such that the wearable device is configured to read, using the one or more second processors, the portions of the audio content and the metadata from the data storage for playback via the one or more amplifiers.
28. The media playback system of claim 23, further comprising data storage, wherein: the first program instructions are executable by the one or more first processors such that the playback device is configured to store, using the one or more first processors, an entire length of audio content associated with the audio content and the metadata to the data storage, and transition the one or more first processors to a low power state; and the second program instructions are executable by the one or more second processors such that the wearable device is configured to read, using the one or more second processors, the entire length of the audio content and the metadata from the data storage for playback via the one or more amplifiers.
29. The media playback system of claim 28, wherein to read comprises to read at least part of the entire length of the audio content and the metadata while the one or more first processors are in the low power state.
30. The media playback system of claim 23, wherein: the metadata is first metadata; the audio content comprises first audio content; the audio information further comprises second audio content; and the first program instructions are executable by the one or more first processors such that the playback device is configured to generate, by the one or more first processors, second metadata associated with the second audio content, combine the first audio content, the first metadata, the second audio content, and the second metadata into a data stream, and communicate, using the one or more first processors, the data stream to the one or more second processors.
31. The media playback system of claim 30, wherein: the data stream comprises a plurality of packets of information, each packet of the plurality of packets comprising header data and payload data; the header data for each packet comprises one of the first metadata or the second metadata; and the payload data of each packet comprises one of a portion of the first audio content that corresponds to the first metadata or a portion of the second audio content that corresponds to the second metadata.
32. The media playback system of claim 31, wherein the second program instructions are executable by the one or more second processors such that the wearable device is configured to process, by the one or more second processors, audio content comprised within the payload data of each packet according to metadata comprised within the header data of the packet.
33. The media playback system of claim 30, wherein: the second audio content comprises a response to a voice command; and the second program instructions are executable by the one or more second processors such that the wearable device is configured to receive, by the one or more second processors and via a microphone associated with the wearable device, third audio content associated with the voice command, communicate, by the one or more second processors, the third audio content to the one or more first processors, and communicate, by the one or more first processors, the third audio content to a server for processing a command associated with the third audio content.
34. The media playback system of claim 23, wherein the second program instructions are executable by the one or more second processors such that the wearable device is configured to adjust, by the one or more second processors, a playback rate associated with playback of the audio content based on a rate at which the audio content is communicated to the one or more second processors by the one or more first processors.
35. The media playback system of claim 23, wherein the wearable device is a headphone device and further comprises at least one earpiece.
36. The media playback system of claim 35, wherein: the headphone device comprises at least one microphone; and the headphone device is a hearable device configured to play back an amplified version of at least some sound detected using the at least one microphone.
37. The media playback system of claim 23, wherein the playback device is a screenless playback device that does not comprise a display screen.
38. The media playback system of claim 23, further comprising an application processor disposed within the one or more first processors.
39. The media playback system of claim 23, wherein the one or more second processors comprise processors other than an application processor.
40. A method of playing back audio using a media playback system comprising a playback device and a wearable device distinct from the playback device, the method comprising: receiving, using a network interface of the playback device, audio information comprising audio content; generating, using one or more first processors of the playback device, metadata associated with the audio content; communicating, using the one or more first processors, the audio content and the metadata to the wearable device; and playing back, using one or more second processors of the wearable device and one or more amplifiers of the wearable device, the audio content based on the metadata.
41. The method of claim 40, wherein: generating the metadata comprises generating metadata comprising a codec associated with the audio content; and playing back the audio content comprises decoding the audio content using the codec.
42. One or more non-transitory computer-readable media storing instructions executable by one or more processors to cause a media playback system comprising a playback device and a wearable device distinct from the playback device to play back audio, the instructions comprising instructions to: receive, using a network interface of the playback device, audio information comprising audio content; generate, using one or more first processors of the playback device, metadata associated with the audio content; communicate, using the one or more first processors, the audio content and the metadata to the wearable device; and play back, using one or more second processors of the wearable device and one or more amplifiers of the wearable device, the audio content based on the metadata.